Text Classification from Chatbot [closed] - machine-learning

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I'm just starting out as a junior data analyst. I applied to a startup and they gave me a test. I'd appreciate a hint on how to approach it. The tasks are:
Given the random utterances below (extracted from a chatbot):
Make a classification plan that identifies the themes of the utterances
Design a data preprocessing step so that the classification works well
Draft a set of selected features so that the classification works well
The utterances below are unstructured Indonesian text:
Makasii ????
Sekarang jam berapa ya?
Hp gua udah 4g gausah diaktifin lagi
Semlekom
Sejag kapan nama saiia blue (Har Har)
Mana gw tau anying
Nggak bisa di klik
ngga udah
no
Saya Mau Complient ...
Terimakasih istriku
Gk sya udh plih
terimakasih :)
halo mau tanya
Assalamualaikumin
Salam
halon
Yaudah lah
mmm
Suka apa
Makasih ya
Gk jdi
sampai jumpa lagi
gak peka
Ga usah
Bodo amat
Senang sekali
Ok maya
Sibuk?
nggak inget
Mending taka kemana mana
Mana nih
Samlekom min
Berhentiin
it hole the Herat break losing I am cry
mau komplain
Hii
can you speak english please?
Nggak muncul -mucul
Ga tau mi
mn Ka gak ada soal nya
hmmm
bosen nih
ok..
kamu suka apa
di mana?
kok namanya
Ga dengar
Saya nanya min
Saya mau tanyq
Sudah selesai
Mau nnya nih
Halo
Males dehh
dah mam?
Gak mau jawab
Termakasih
Alaah
Jam berapa ini?
Anjirrr &gt
Ok terima kasih infonya????
Kwkwkwkw
Namamu siapa?
y
Ribet ah
ya terimakasih
Saya milih dimana
ribet anjay
Ah
Waah menarik niih min
Ah gk jls bicara ama lu
Ga guna
tidak, terimakasih
Gak jelas!!!
Terimakasih
Aduh aku bingung
Mengapa diam saja
Mau nanya dong mi
I love u full
May ku sayang bngt ama kamu
Auw ahh ga bisa saya mahh
Ga ada menu nya mas
Kamu sotta deh
menu mana
Salah klik maya ,harusnya lainnya
sudah makan?
Masntap
Malah milih produk
kenapa namanya
Saya mau tanya ?
Gue udah tau
Bikin kesel ja
Makasih kak
Assalamualaiku wr wb...
Tetap gak bisa
Mbak
kok diam
apa ya
menunya mana
Makasiih
testing
capek deh
bosen banget
kok gitu?
tks
Tak Nyaman
asslamualaikum
terima kasih :)
Mas/mba saya mau tanya
Pgi
Kok ga muncul apa2?
lahir tahun berapa
Okay terimakasih
Sudah kak
Kurang mengerti
Tidaak
Ok dude
ok bye
Permisi mau tanya
gimana ya
Sampun cot
Makan Bang
Yeay di bls :v
alhamdullilah
mau tanya bisa ?
Assalamualaikumin
permisi mau tanya
Terima kasih
pertanyaan yg membingungkan
Rumahnya di mana?
Dasar boot
Mana menu?
wokee
eh... udah dibilang enggak
Baru sibuk?
Sementara belum. Tq
thanks ya
Pilih yg mna
bosan saya
uhhh
Gak nanya lu maya
mana jawabanya
Hobimu apa
Thanks (love)
Apa kabar?
Meong
nanya berulang ulang
LEMOT EUY
Sumpah ngeseli
Ribet dahh
Hy jg
Barusan Sdh bsa min.. mksh
Ga nyambung jawaban nya
test
Ga jelas bnykn tnya
Saya mau nanya boleh??
Ok thank you
Lanjut Book
Tinggal di mana?
ihhh
Belum ada...
Ngomonge bae cepet
Hei
LoL
thx u
Banyak tanya
Hahahahhahahahaha
Ouy
Sudah, terima kasih
Belom saya cek
Mau tidur ama ayam
Ok mksh
Apaansih&gt
Jaringannya cepet banget dah
yng mana
okei
Okeyv
Tidak penting
Maksudnya gimana ini?
Gausah jwb berbelit2 dah
yaudahlah gpp ttp cantik ko (love)
ok makasih
Mau nanya donk
Lagi sibuk?
selesai.
Kagak nannya namalu
Saya mau menyampaikan keluhan
min di sini sinyalnya cepet banget
Coyyy
Hmmm -,-
jjkakak
Bisa bahasa apa saja?
Gai
Lagi ngapain?
Ga sekolah lu ua
Umur kamu berapa?
Tydak
Ga pernah bgus
sangat baik sesuai program komputer...
link itu gak bisa di klik
Saya ada masalah...
Ra iso nuw kok..apus2
Mana menunya yh kak
Hadeeeeuh
sekarang jaringan nya bagus loh
Auk ahh..
Aku gk tau
Syaa mau tanya
serius?
MAU KOMPLAIN!!
Ga bisa mbak
Ga ah nanti kamu genit
Kalau gini terus merugikan orang
Unchhhh
Gatau diuntung!!!
Wkwkwkwkwkwk
Saya juga kurang mengerti.
Aku nanya s mba nya jomblo ngga ??
Mana yg dipilih?
Mana? :v
besok ya
Arigato
Hobby nya apa
mana? gaada?
GO
Makasih sayang
Jelek
Cape lah kuya
Siapa kamu?
Saya ingin menyamoaikan keluhan
mana menunya woy?
Gk tau anjeng
Msh aman
Gk tau
Maaf kepencet
saya ingin bertanya
Ribet cukk gue lgi sibuk &gt
Love you may
Menu mna?
uyyy
mlm
Tetimakasih
Kwlwkwkwkkwkwkw
Biasanya g Kay gini
eh
wah menarik ini
met bobo
Left bentar ya
Ngga mau
english?
sedang apa
sekarang sinyalnya udah lumayan bagus
Yaiyalah
Mksdnya apa si
Samlikum min
Masih gak bisa
Boleh nanya ?
mana ya
Saya punya pernasalahan
Kecewa ini saya
umurnya berapa?
Mau tanya bisa
Wah gk jlas
Sudah ada.
Mana ada menu
Bapak kau
Mana menu nya njit
Thanks
Jaringan bagus sekali ya
Apasih ini ga jelas
Ndak tau
Mungkin lain kali
ach ribet amat
Hu
Makasiiihh
Ok!
Min -_- lu sehat?
Gausah ngalem jawab aja kenapa
Clear
menu yang mana ya
Ribet!
nnti ae
bagus ini
Cepet amat respon nya
oke anjir
Maaf ga bisa d klik
kamu di mana
Kenapa kok maya belum paham sij
Saya kurang maksud atas pertanyaan anda
Zzz
Lagi banyak kerjaan?
Misi min mau tanya
test 1 2 3
Bodoamayttt
Tidak ada pilihan
if lu okay illu
Salah chat kwkwkkw
maksudnya apa ya
Tidak.
Rugi gw belinya
Okee siaap
Min maksudnya
MENU MANA SAJA?
Hi
Thanks sangat ya min
sangat tidak membantu !!!
OK selesai
nggak bisa min
Kgk mau gua
Kami selalu mendukung anda
hadeuhh ribet
fa cai :)
Terimakasi
bingung
Tanks kakak
Mau tanya.bisa?
kalian luar biasa
Detaya :(
Ah sudahlah
Baru apa nih
yaiyalah
Sudah benar
Di mana nih?
menu tidak ada?
SAYA KOMPLAIN
Lagi apa?
Gak
Kok sy ga bisa pilih apa2?
Gak ada. Terima Kasih!
masa sih?
Ah ga ad gunanya sistem bgni
Saya sedang bertanya
waduh
Hi
gak doyan
Sementara 7 dulu
Jangan muter muter terus mba
Mau nanya ??
Kgk bisa
pacarnya siapa?
Hi hi hi
Mana menunya? Ga muncul, ampun deh
Kagak bisa di pake hhhhhh percuma
Gak berguna
Permisi saya mau nanya
Can u speak english pls
Sudah makan belum?
Aduuhh ribet
Menu nya yang mana
super
gila lu
Gimana kabarnya?
Uda ku isi pak
tes
Ya sudah lah ...
Saya cinta kami
Udah punya pacar?
Sudah tadi.. makasih
Gitu Aja Terus
Oh begitu ya. Terimakasih
Okay terima kasih..
Assalamu allaikum
sialan,,pgn nanya aja dipersulit gini
yang bener?
met tidur
Sedang apa?
dibilang mau logout
Entahlah
Umurmu berapa?
Rumah kamu dimana
sinyal nya bagus
yah -,-
Singkirin bot ini ah
Baru di mana?
Mksudnya gimana sih
Kog menu nya tdk bisa di klik?
hm
Layanan internet selalu cepat
Terima kasih min :)
Kamu siapa?
Saya udah gak ngerti.
Gua mau tdurrr, mlm
Gg tau
Gk butuh informasi produk!
parah ini robot
Makasih ka
Punya anjir. Bacod mulu dari tadi
Terima kasih atas infonya bak
Can you speak English ?
Namanya siapa?
Mana gua tau selesai
What algorithm do I need to complete those 3 tasks?

Here you need to formulate your problem using the data (if it is provided with the test). Identify the different themes in the data. I do not speak Indonesian, so I cannot help with that, but the classification problem could be sentiment analysis, emotion classification, etc. In your particular case, I think the problem would be different. It would be helpful if you could add a translation of these sentences to your question.
You need to preprocess the data so that the classifier can perform better. For English, preprocessing usually involves removing stop words, lemmatization, removing noisy or irrelevant data, etc.
Select features that you think will be useful for classification, for example the presence of a question mark. Generally, word embeddings are used to classify textual data, but the test requires you to select features, so you need to think of some that fit your classification problem.
Edit:
Once you formulate your classification problem, you can identify the preprocessing steps and features easily, and you can apply any classification algorithm, such as a decision tree, random forest, or neural network.
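The preprocessing and feature ideas above can be sketched roughly as follows. This is a minimal illustration, not a vetted pipeline: the `SLANG` map, the function names `normalize` and `extract_features`, and the choice of features are all assumptions made for the example, and a real slang lexicon would need to be built by someone who reads informal Indonesian.

```python
import re

# Hypothetical slang map (an assumption for illustration, not a vetted lexicon).
SLANG = {
    "gk": "tidak", "ga": "tidak", "gak": "tidak", "nggak": "tidak",
    "makasih": "terimakasih", "mksh": "terimakasih", "tks": "terimakasih",
    "yg": "yang", "udh": "sudah", "udah": "sudah",
}


def normalize(text):
    """Lowercase, squeeze elongated characters, tokenize, and map slang."""
    text = text.lower()
    # Collapse any character repeated 3+ times to a single one ("!!!" -> "!").
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # Keep words and the two punctuation marks used as features below.
    tokens = re.findall(r"[a-z0-9]+|[?!]", text)
    return [SLANG.get(tok, tok) for tok in tokens]


def extract_features(text):
    """Return a few hand-crafted features for one chat utterance."""
    tokens = normalize(text)
    letters = [c for c in text if c.isalpha()]
    return {
        "has_question_mark": "?" in text,
        "has_exclamation": "!" in text,
        # True when every letter in the raw text is uppercase.
        "is_shouting": bool(letters) and all(c.isupper() for c in letters),
        # True when the raw text repeats some character 3+ times in a row.
        "has_elongation": re.search(r"(.)\1{2,}", text) is not None,
        "n_words": sum(tok not in ("?", "!") for tok in tokens),
        "tokens": tokens,
    }


print(normalize("Gak jelas!!!"))
print(extract_features("MAU KOMPLAIN!!"))
```

The feature dictionary can then be fed to whichever classifier is chosen once the themes (the labels) have been defined.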

So what I did is:
Put those lines in a CSV file and load them into a list
with open('/content/test.csv') as f:
    content = f.readlines()
# you may also want to remove whitespace characters like `\n` at the end of each line
content = [x.strip() for x in content]
Pass the list to KMeans clustering
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
# turn each line into a TF-IDF weighted bag-of-words vector
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(content)
true_k = 10
model = KMeans(n_clusters=true_k, init='k-means++', max_iter=100, n_init=1)
model.fit(X)
# for each cluster, sort the term indices by descending centroid weight
order_centroids = model.cluster_centers_.argsort()[:, ::-1]
# get_feature_names_out() replaces get_feature_names(), which scikit-learn 1.2 removed
terms = vectorizer.get_feature_names_out()
for i in range(true_k):
    print('Cluster %d:' % i)
    for ind in order_centroids[i, :10]:
        print(' %s' % terms[ind])
Test it
print("Prediction")
X = vectorizer.transform(['makasih istriku'])
predicted = model.predict(X)[0]
print(predicted)
Prediction
5
Does this satisfy the task?
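One thing to keep in mind with the word-level TF-IDF above is that spelling variants such as "Makasih" and "Makasiih" become unrelated tokens. A rough way to see why character n-grams may help is sketched below in plain Python; `char_ngrams` and `jaccard` are hypothetical helper names introduced for this illustration, not part of scikit-learn.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with one space on each side."""
    padded = f" {word.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity between the character n-gram sets of two words."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    union = ga | gb
    if not union:
        return 0.0
    return len(ga & gb) / len(union)


# Spelling variants share most trigrams, unrelated words share none.
print(jaccard("Makasih", "Makasiih"))
print(jaccard("Makasih", "komplain"))
```

In scikit-learn, a comparable effect can be obtained by passing `analyzer="char_wb"` together with an `ngram_range` to `TfidfVectorizer`; whether that improves the clusters on this data would need to be checked.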

Related

How to add a paragraph in a LaTeX file

I want to add a paragraph to the Summary section in this LaTeX template.
I've tried using \paragraph, but I am not able to get the results I want. I'm very new to LaTeX, so \newcommand is still a little confusing to me.
Here is the template source:
\documentclass[letterpaper,11pt]{article}
\usepackage{latexsym}
\usepackage[empty]{fullpage}
\usepackage{titlesec}
\usepackage{marvosym}
\usepackage[usenames,dvipsnames]{color}
\usepackage{verbatim}
\usepackage{enumitem}
\usepackage[hidelinks]{hyperref}
\usepackage{fancyhdr}
\usepackage[english]{babel}
\usepackage{tabularx}
\input{glyphtounicode}
%----------FONT OPTIONS----------
% sans-serif
% \usepackage[sfdefault]{FiraSans}
% \usepackage[sfdefault]{roboto}
% \usepackage[sfdefault]{noto-sans}
% \usepackage[default]{sourcesanspro}
% serif
% \usepackage{CormorantGaramond}
% \usepackage{charter}
\pagestyle{fancy}
\fancyhf{} % clear all header and footer fields
\fancyfoot{}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
% Adjust margins
\addtolength{\oddsidemargin}{-0.5in}
\addtolength{\evensidemargin}{-0.5in}
\addtolength{\textwidth}{1in}
\addtolength{\topmargin}{-.5in}
\addtolength{\textheight}{1.0in}
\urlstyle{same}
\raggedbottom
\raggedright
\setlength{\tabcolsep}{0in}
% Sections formatting
\titleformat{\section}{
\vspace{-4pt}\scshape\raggedright\large
}{}{0em}{}[\color{black}\titlerule \vspace{-5pt}]
% Ensure that generate pdf is machine readable/ATS parsable
\pdfgentounicode=1
%-------------------------
% Custom commands
\newcommand{\resumeItem}[1]{
\item\small{
{#1 \vspace{-2pt}}
}
}
\newcommand{\resumeSubheading}[4]{
\vspace{-2pt}\item
\begin{tabular*}{0.97\textwidth}[t]{l@{\extracolsep{\fill}}r}
\textbf{#1} & #2 \\
\textit{\small#3} & \textit{\small #4} \\
\end{tabular*}\vspace{-7pt}
}
\newcommand{\resumeSubSubheading}[2]{
\item
\begin{tabular*}{0.97\textwidth}{l@{\extracolsep{\fill}}r}
\textit{\small#1} & \textit{\small #2} \\
\end{tabular*}\vspace{-7pt}
}
\newcommand{\resumeProjectHeading}[2]{
\item
\begin{tabular*}{0.97\textwidth}{l@{\extracolsep{\fill}}r}
\small#1 & #2 \\
\end{tabular*}\vspace{-7pt}
}
\newcommand{\resumeSubItem}[1]{\resumeItem{#1}\vspace{-4pt}}
\renewcommand\labelitemii{$\vcenter{\hbox{\tiny$\bullet$}}$}
\newcommand{\resumeSubHeadingListStart}{\begin{itemize}[leftmargin=0.15in, label={}]}
\newcommand{\resumeSubHeadingListEnd}{\end{itemize}}
\newcommand{\resumeItemListStart}{\begin{itemize}}
\newcommand{\resumeItemListEnd}{\end{itemize}\vspace{-5pt}}
%-------------------------------------------
%%%%%% RESUME STARTS HERE %%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
%----------HEADING----------
% \begin{tabular*}{\textwidth}{l@{\extracolsep{\fill}}r}
% \textbf{\href{http://sourabhbajaj.com/}{\Large Sourabh Bajaj}} & Email : \href{mailto:sourabh@sourabhbajaj.com}{sourabh@sourabhbajaj.com}\\
% \href{http://sourabhbajaj.com/}{http://www.sourabhbajaj.com} & Mobile : +1-123-456-7890 \\
% \end{tabular*}
\begin{center}
\textbf{\Huge \scshape Name} \\ \vspace{1pt}
\small 734-548-4835 $|$ \href{email}{\underline{email}} $|$
\href{}{\underline{linkedin}} $|$
\href{}{\underline{github}}
\end{center}
%-----------EDUCATION-----------
\section{Summary}
\resumeSubHeadingListStart
\resumeSubheading
\resumeSubheading
\resumeSubHeadingListEnd
%-----------EXPERIENCE-----------
\section{Experience}
\resumeSubHeadingListStart
\resumeSubheading
{Undergraduate Research Assistant}{June 2020 -- Present}
{Texas A\&M University}{College Station, TX}
\resumeItemListStart
\resumeItem{Developed a REST API using FastAPI and PostgreSQL to store data from learning management systems}
\resumeItem{Developed a full-stack web application using Flask, React, PostgreSQL and Docker to analyze GitHub data}
\resumeItem{Explored ways to visualize GitHub collaboration in a classroom setting}
\resumeItemListEnd
% -----------Multiple Positions Heading-----------
% \resumeSubSubheading
% {Software Engineer I}{Oct 2014 - Sep 2016}
% \resumeItemListStart
% \resumeItem{Apache Beam}
% {Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines}
% \resumeItemListEnd
% \resumeSubHeadingListEnd
%-------------------------------------------
\resumeSubheading
{Information Technology Support Specialist}{Sep. 2018 -- Present}
{Southwestern University}{Georgetown, TX}
\resumeItemListStart
\resumeItem{Communicate with managers to set up campus computers used on campus}
\resumeItem{Assess and troubleshoot computer problems brought by students, faculty and staff}
\resumeItem{Maintain upkeep of computers, classroom equipment, and 200 printers across campus}
\resumeItemListEnd
\resumeSubheading
{Artificial Intelligence Research Assistant}{May 2019 -- July 2019}
{Southwestern University}{Georgetown, TX}
\resumeItemListStart
\resumeItem{Explored methods to generate video game dungeons based off of \emph{The Legend of Zelda}}
\resumeItem{Developed a game in Java to test the generated dungeons}
\resumeItem{Contributed 50K+ lines of code to an established codebase via Git}
\resumeItem{Conducted a human subject study to determine which video game dungeon generation technique is enjoyable}
\resumeItem{Wrote an 8-page paper and gave multiple presentations on-campus}
\resumeItem{Presented virtually to the World Conference on Computational Intelligence}
\resumeItemListEnd
\resumeSubHeadingListEnd
%-----------PROJECTS-----------
\section{Projects}
\resumeSubHeadingListStart
\resumeProjectHeading
{\textbf{Gitlytics} $|$ \emph{Python, Flask, React, PostgreSQL, Docker}}{June 2020 -- Present}
\resumeItemListStart
\resumeItem{Developed a full-stack web application using with Flask serving a REST API with React as the frontend}
\resumeItem{Implemented GitHub OAuth to get data from user’s repositories}
\resumeItem{Visualized GitHub data to show collaboration}
\resumeItem{Used Celery and Redis for asynchronous tasks}
\resumeItemListEnd
\resumeProjectHeading
{\textbf{Simple Paintball} $|$ \emph{Spigot API, Java, Maven, TravisCI, Git}}{May 2018 -- May 2020}
\resumeItemListStart
\resumeItem{Developed a Minecraft server plugin to entertain kids during free time for a previous job}
\resumeItem{Published plugin to websites gaining 2K+ downloads and an average 4.5/5-star review}
\resumeItem{Implemented continuous delivery using TravisCI to build the plugin upon new a release}
\resumeItem{Collaborated with Minecraft server administrators to suggest features and get feedback about the plugin}
\resumeItemListEnd
\resumeSubHeadingListEnd
%
%-----------PROGRAMMING SKILLS-----------
\section{Technical Skills}
\begin{itemize}[leftmargin=0.15in, label={}]
\small{\item{
\textbf{Languages}{: Java, Python, C/C++, SQL (Postgres), JavaScript, HTML/CSS, R} \\
\textbf{Frameworks}{: React, Node.js, Flask, JUnit, WordPress, Material-UI, FastAPI} \\
\textbf{Developer Tools}{: Git, Docker, TravisCI, Google Cloud Platform, VS Code, Visual Studio, PyCharm, IntelliJ, Eclipse} \\
\textbf{Libraries}{: pandas, NumPy, Matplotlib}
}}
\end{itemize}
%-------------------------------------------
\end{document}

Fix spacing on Overleaf with LaTeX

I am working on my resume using LaTeX in Overleaf. I am having issues spacing everything properly. Here is my LaTeX:
%-------------------------
% Resume in Latex
% Author : Sidratul Muntaha Ahmed
% License : MIT
%------------------------
\documentclass[letterpaper,12pt]{article}
\usepackage{latexsym}
\usepackage[empty]{fullpage}
\usepackage{titlesec}
\usepackage{marvosym}
\usepackage[usenames,dvipsnames]{color}
\usepackage{verbatim}
\usepackage{enumitem}
\usepackage[hidelinks]{hyperref}
\usepackage{fancyhdr}
\usepackage[english]{babel}
\usepackage{tabularx}
\input{glyphtounicode}
%----------FONT OPTIONS----------
% sans-serif
% \usepackage[sfdefault]{FiraSans}
% \usepackage[sfdefault]{roboto}
% \usepackage[sfdefault]{noto-sans}
% \usepackage[default]{sourcesanspro}
% serif
% \usepackage{CormorantGaramond}
% \usepackage{charter}
\pagestyle{fancy}
\fancyhf{} % clear all header and footer fields
\fancyfoot{}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
% Adjust margins
\addtolength{\oddsidemargin}{-0.5in}
\addtolength{\evensidemargin}{-0.5in}
\addtolength{\textwidth}{1in}
\addtolength{\topmargin}{-.5in}
\addtolength{\textheight}{1.0in}
\urlstyle{same}
\raggedbottom
\raggedright
\setlength{\tabcolsep}{0in}
% Sections formatting
\titleformat{\section}{
\vspace{-4pt}\scshape\raggedright\large
}{}{0em}{}[\color{black}\titlerule \vspace{1pt}]
% Ensure that generate pdf is machine readable/ATS parsable
\pdfgentounicode=1
%-------------------------
% Custom commands
\newcommand{\resumeItem}[1]{
\item\small{
{#1 \vspace{-6pt}}
}
}
\newcommand{\resumeSubheading}[4]{
\vspace{-9.5pt}\item
\begin{tabular*}{0.97\textwidth}[t]{l@{\extracolsep{\fill}}r}
\textbf{#1} & #2 \\
\textit{\small#3} & \textit{\small #4} \\
\end{tabular*}\vspace{-10pt}
}
\newcommand{\resumeSubSubheading}[2]{
\item
\begin{tabular*}{0.97\textwidth}{l@{\extracolsep{\fill}}r}
\textit{\small#1} & \textit{\small #2} \\
\end{tabular*}\vspace{-10pt}
}
\newcommand{\resumeProjectHeading}[2]{
\item
\begin{tabular*}{0.97\textwidth}{l@{\extracolsep{\fill}}r}
\small#1 & #2 \\
\end{tabular*}\vspace{-10pt}
}
\newcommand{\resumeSubItem}[1]{\resumeItem{#1}\vspace{-5pt}}
\renewcommand\labelitemii{$\vcenter{\hbox{\tiny$\bullet$}}$}
\newcommand{\resumeSubHeadingListStart}{\begin{itemize}[leftmargin=0.15in, label={}]}
\newcommand{\resumeSubHeadingListEnd}{\end{itemize}}
\newcommand{\resumeItemListStart}{\begin{itemize}}
\newcommand{\resumeItemListEnd}{\end{itemize}\vspace{-5pt}}
%-------------------------------------------
%%%%%% RESUME STARTS HERE %%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
%----------HEADING----------
\begin{center}
\textbf{\Huge \scshape Brad Bieselin} \\ \vspace{1pt}
\textbf{\small \scshape 516-557-8407 $|$ Wantagh, NY} \\ \vspace{1pt}
\href{https://bradbieselin.com/}{\underline{bradbieselin.com}} $|$ \href{mailto:brad@bradbieselin.com}{\underline{brad@bradbieselin.com}} $|$
\href{https://linkedin.com/in/bradbieselin}{\underline{linkedin.com/in/bradbieselin}} $|$
\href{https://github.com/bradbieselin}{\underline{github.com/bradbieselin}}
\end{center}
%-----------Experience-----------
\section{Experience}
\resumeSubHeadingListStart
\resumeSubheading
{Witbe Inc}{New York, NY}
{Project Manager/Customer Success Manager}{June 2021-Present}
\resumeItemListStart
\resumeItem{Led and managed 9 projects for customers including Verizon, ViacomCBS, and Peacock by automating VoD asset testing, Channel Change testing, and video/audio quality analysis using Witbe robots}
\resumeItem{Prepared and facilitated 8+ hour interactive training sessions for all customers, including 10+ senior executives on delivery of new products and software}
\resumeItem{Planned, scheduled and executed all software, hardware, and system integration with over 100 Witbe robots and customer devices to ensure the best Quality of Experience monitoring}
\resumeItem{Established growth and generated improvements to customer business by over 40\% by managing automated scripting and monitoring projects, and finding opportunities for improvement}
\resumeItem{Stimulated progress of each project to meet deadlines and standards with 10+ check-ins per week and constant improvemenets to automation scripts}
\resumeItemListEnd
\resumeSubHeadingListEnd
\resumeSubHeadingListStart
\resumeSubheading
{Apple Inc}{Remote}
{Software Quality Engineer}{February 2021-June 2021}
\resumeItemListStart
\resumeItem{Developed projects in Java using Jackson, Jenkins, TestNG and Gradle which analyzed and validated Apple.com product prices when adding products to the cart, testing over 3000 scenarios}
\resumeItem{Engineered a Full-Stack project which allows the upload and download of files, built with a React.js and Spring Boot backend, and using Amazon S3 cloud storage to store files}
\resumeItem{Gained fluency in AWS with 40+ hours of training and obtained AWS Cloud Practitioner Certification}
\resumeItem{Collaborated with development engineers for code review regularly to propose improvements}
\resumeItem{Collected data, devised Confluence reports, and presented reports during weekly team meetings which highlighted progress of software quality testing projects}
\resumeItemListEnd
\resumeSubHeadingListEnd
\resumeSubHeadingListStart
\resumeSubheading
{Frontend Web Developer}{}
{Freelance}{January 2021-Present}
\resumeItemListStart
\resumeItem{Coding web applications and integrating CMS driven data, such as Contentful}
\resumeItem{Designing and creating user interfaces with modern frameworks including React.js and Next.js}
\resumeItem{Completed FrontendMasters Bootcamp and 40+ hours of additional courses including Functional Javascript, Advanced CSS, Computer Science, and Website Accessibility}
\resumeItemListEnd
\resumeSubHeadingListEnd
%-----------Education-----------
\section{Projects}
\resumeSubHeadingListStart
\resumeProjectHeading
{\href{https://studentprofiles.bradbieselin.com/}{\underline{Student Profiles}} $|$ \emph{React.js/Firebase}}{}
\resumeItemListStart
\resumeItem{Developed a website which lets users filter student profiles by name or tag and see test scores driven by an API, using React.js for the frontend}
\resumeItemListEnd
\resumeSubHeadingListEnd
\resumeSubHeadingListStart
\resumeProjectHeading
{\href{https://bitcointracker.bradbieselin.com/}{\underline{Bitcoin Tracker}} $|$ \emph{Javascript/Firebase}}{}
\resumeItemListStart
\resumeItem{An animated sunset built with Javascript, CSS, and HTML that tracks the live price of Bitcoin using the Binance.com websocket}
\resumeItemListEnd
\resumeSubHeadingListEnd
%
%-----------Education-----------
\section{Education}
\resumeSubHeadingListStart
\resumeProjectHeading
{\textbf{University at Albany, SUNY} \emph{}}{May 2015-December 2019}
\resumeItemListStart
\resumeItem{Bachelor of Science in Computer Science}
\resumeItem{Awarded from the College of Engineering and Applied Science}
\resumeItemListEnd
\resumeSubHeadingListEnd
%
%-----------SKILLS-----------
\section{Skills}
\begin{itemize}[leftmargin=0.15in, label={}]
\small{\item{
\textbf{Languages}{: Javascript, TypeScript, CSS, HTML, Java, Python, C, LaTex, PowerShell, Bash} \\
\textbf{Web Skills}{: WebPack, Babel, Tailwind, Styled Components, SEO, Contentful CMS, Amazon Web Services} \\
\textbf{React Skills}{: Next, Gatsby, Material/Semantic UI} \\
\textbf{Technologies}{: Git, WSL 2, VS Code/Studio, IntelliJ, Node.js, Express, Spring Boot} \\
}}
\end{itemize}
%-------------------------------------------
\end{document}
I can't get the spacing to work properly. In my Projects, Education and Skills sections there is a lot of empty space, and I want everything to be much closer together. Does anyone have any suggestions? Thank you!
Here is an example of the spacing I am looking to achieve:
Here is what the section I am having trouble with looks like:

How can I left-align each column of a table? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 3 years ago.
I want to describe a number of variables in an equation and format it in a certain way. I don't know how to do that.
I have tried using the \begin{table} environment together with \begin{flushleft} and \begin{tabular}.
Also, I tried using \begin{align*} and \begin{flalign*}.
\begin{flalign*}
&C_R&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Büro}&[\frac{g}{m^3}]\\
&C_{Amb}&\text{CO\textsubscript{2}-Masse pro Luftvolumen in der Umgebung}&[\frac{g}{m^3}]\\
&C_{Corr}&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Flur}&[\frac{g}{m^3}]\\
&C_{ProdPP}&\text{CO\textsubscript{2}-Masse-Produktion pro Luftvolumen pro Person}&[\frac{g}{min}]\\
&\dot{m}_{airx}&\text{Luftmassenstrom vom Büro in die Umgebung und benachbarte Räume}&[\frac{kg}{min}]\\
&\dot{m}_{Amb}&\text{Luftmassenstrom von der Umgebung in das Büro}&[\frac{kg}{min}]\\
&\dot{m}_{Corr}&\text{Luftmassenstrom vom Flur in das Büro}&[\frac{kg}{min}]\\
&\rho_{air}&\text{Dichte der Luft}&[\frac{kg}{m^3}]\\
&V_{office}&\text{Volumen des Büros}&[m^3]\\
&\Delta t&\text{Zeitschritt}&[min]\\
&n_{OCC,i}&\text{Anzahl der anwesenden Personen}&[-]\\
\end{flalign*}
\begin{table}
\begin{flushleft}
\begin{tabular}{c c c}
$C_R$&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Büro}&[$\frac{g}{m^3}$]\\
$C_{Amb}$&\text{CO\textsubscript{2}-Masse pro Luftvolumen in der Umgebung}&[$\frac{g}{m^3}$]\\
$C_{Corr}$&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Flur}&[$\frac{g}{m^3}$]\\
$C_{ProdPP}$&\text{CO\textsubscript{2}-Masse-Produktion pro Luftvolumen pro Person}&[$\frac{g}{min}$]\\
$\dot{m}_{airx}$&\text{Luftmassenstrom vom Büro in die Umgebung und benachbarte Räume}&[$\frac{kg}{min}$]\\
$\dot{m}_{Amb}$&\text{Luftmassenstrom von der Umgebung in das Büro}&[$\frac{kg}{min}$]\\
$\dot{m}_{Corr}$&\text{Luftmassenstrom vom Flur in das Büro}&[$\frac{kg}{min}$]\\
$\rho_{air}$&\text{Dichte der Luft}&[$\frac{kg}{m^3}$]\\
$V_{office}$&\text{Volumen des Büros}&[$m^3$]\\
$\Delta t$&\text{Zeitschritt}&[$min$]\\
% $n_{OCC,i}$&\text{Anzahl der anwesenden Personen}&[$-$]\\
\end{tabular}
\end{flushleft}
\end{table}
This is the current result I get:
Here, the middle column is right-aligned.
But I want each column to be aligned to the left.
If the columns should be left-aligned, I suggest simply using l columns instead of c columns:
\documentclass{article}
\usepackage{mathtools}
\begin{document}
\begin{table}
%\begin{flushleft}
\begin{tabular}{lll}
$C_R$&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Büro}&[$\frac{g}{m^3}$]\\
$C_{Amb}$&\text{CO\textsubscript{2}-Masse pro Luftvolumen in der Umgebung}&[$\frac{g}{m^3}$]\\
$C_{Corr}$&\text{CO\textsubscript{2}-Masse pro Luftvolumen im Flur}&[$\frac{g}{m^3}$]\\
$C_{ProdPP}$&\text{CO\textsubscript{2}-Masse-Produktion pro Luftvolumen pro Person}&[$\frac{g}{min}$]\\
$\dot{m}_{airx}$&\text{Luftmassenstrom vom Büro in die Umgebung und benachbarte Räume}&[$\frac{kg}{min}$]\\
$\dot{m}_{Amb}$&\text{Luftmassenstrom von der Umgebung in das Büro}&[$\frac{kg}{min}$]\\
$\dot{m}_{Corr}$&\text{Luftmassenstrom vom Flur in das Büro}&[$\frac{kg}{min}$]\\
$\rho_{air}$&\text{Dichte der Luft}&[$\frac{kg}{m^3}$]\\
$V_{office}$&\text{Volumen des Büros}&[$m^3$]\\
$\Delta t$&\text{Zeitschritt}&[$min$]\\
% $n_{OCC,i}$&\text{Anzahl der anwesenden Personen}&[$-$]\\
\end{tabular}
%\end{flushleft}
\end{table}
\end{document}
(Unrelated to the problem: units should be set upright, and math mode should not be used for multi-letter words like min, office, etc. It might be a good idea to have a look at the siunitx package, which makes it easier to typeset units correctly.)
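As a minimal sketch of the remark about units, assuming no extra package is loaded, two of the rows could be written with upright units and upright multi-letter subscripts using `\mathrm`. The siunitx package offers dedicated macros for the same purpose; this is only an illustration of the idea, not the only way to do it.

```latex
$V_{\mathrm{office}}$ & Volumen des Büros & [$\mathrm{m}^3$] \\
$\dot{m}_{\mathrm{Amb}}$ & Luftmassenstrom von der Umgebung in das Büro & [$\frac{\mathrm{kg}}{\mathrm{min}}$] \\
```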

Data Visualization & Machine Learning

Preprocess the data and compare the results before and after preprocessing (report them as accuracy)
Draw the following charts:
Correlation heatmap
Missing-values heatmap
Line chart / scatter chart for Country vs Purchased, Age vs Purchased and Salary vs Purchased
Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
Spain,,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,,Yes
France,,18888,No
Spain,17,67890,Yes
Germany,,12000,No
Spain,38,98888,No
Germany,50,,Yes
France,35,58000,Yes
Spain,,12345,No
France,23,,Yes
Germany,55,78456,No
France,,43215,Yes
A scatter plot such as Country vs Purchased can be hard to read, because both axes are categorical. All three countries in your list have some purchases, so the points overlap. A heatmap can be more helpful here.
import pandas as pd
from matplotlib import pyplot as plt
# read the CSV with pandas
df = pd.read_csv('Data.csv')
# keep an independent copy of the raw data (plain assignment would only alias df)
copydf = df.copy()
# before data preprocessing
print(copydf)
# fill NaN values with the column mean of Age and Salary
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())
# after data preprocessing
print(df)
plt.figure(1)
# Country Vs Purchased
plt.subplot(221)
plt.scatter(df['Country'], df['Purchased'])
plt.title('Country vs Purchased')
plt.grid(True)
# Age Vs Purchased
plt.subplot(222)
plt.scatter(df['Age'], df['Purchased'])
plt.title('Age vs Purchased')
plt.grid(True)
# Salary Vs Purchased
plt.subplot(223)
plt.scatter(df['Salary'], df['Purchased'])
plt.title('Salary vs Purchased')
plt.grid(True)
plt.subplots_adjust(top=0.92, bottom=0.08, left=0.10, right=0.95, hspace=0.75,
wspace=0.5)
plt.show()
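The heatmap suggested above for Country vs Purchased would display the counts of each (Country, Purchased) combination. A minimal sketch of computing those counts with only the standard library is shown below; the `crosstab` helper is a hypothetical name for this example, and the pairs are copied from the 20 rows in the question.

```python
from collections import Counter

# (Country, Purchased) pairs copied from the 20 rows in the question.
rows = [
    ("France", "No"), ("Spain", "Yes"), ("Germany", "No"), ("Spain", "No"),
    ("Germany", "Yes"), ("France", "Yes"), ("Spain", "No"), ("France", "Yes"),
    ("Germany", "No"), ("France", "Yes"), ("France", "No"), ("Spain", "Yes"),
    ("Germany", "No"), ("Spain", "No"), ("Germany", "Yes"), ("France", "Yes"),
    ("Spain", "No"), ("France", "Yes"), ("Germany", "No"), ("France", "Yes"),
]


def crosstab(pairs):
    """Count how often each (row_label, column_label) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # One row per country, one column per Purchased value.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


row_labels, col_labels, table = crosstab(rows)
print("Country", *col_labels)
for label, row_counts in zip(row_labels, table):
    print(label, *row_counts)
```

The resulting nested list can be handed to a plotting call such as matplotlib's `plt.imshow(table)` to draw the heatmap itself, with `row_labels` and `col_labels` used as tick labels.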

How do I correct a bad sentence alignment (parallel corpus)?

I have a parallel corpus (pt-en), but it has some alignment mistakes. This usually happens because Portuguese sentences are longer, and when the translation is done by humans (gold standard) they split a sentence in two for better understanding. It doesn't happen very often, but I need a perfect alignment to train my machine.
Example (EN sentence numbers - PT sentence numbers): (1 - 1);(2, 3 - 2);(4 - 3);(5 - 4)
EN
1 - Once the individuals at high risk for events are identified, we propose that they be treated according to the current prevention guidelines, mainly in regard to the use of statins and acetylsalicylic acid2.
2 - We suggest that further studies approach the determination of coronary artery calcium scores in women, assess a greater number of males at more advanced ages and of different social classes than that of the population studied (middle-class individuals).
3 - We also suggest that the coronary calcium scores reported for Brazilian populations should be compared with those of populations in the USA, because these are the patterns available in the literature.
4 - If differences are found, a prospective study about the value of coronary artery calcification in our population should be carried out.
5 - Our study is a starting point.
PT
1 - Uma vez identificados os indivíduos sob risco elevado de eventos, propomos que estes sejam tratados conforme as diretrizes atuais de prevenção, principalmente, no que se refere ao uso de estatinas e ácido acetilsalicílico2.
2 - Para seguimento de nosso trabalho, tornam-se necessários determinações em mulheres, número maior de homens em idades mais avançadas e de classes sociais diferentes da população estudada (indivíduos de classe média), comparação dos escores de cálcio coronariano descritos em populações brasileiras aos dos EUA, já que esses são os padrões disponíveis na literatura.
3 - Caso haja diferenças, seria importante um estudo prospectivo do valor da calcificação arterial coronariana em nossa população.
4 - Nosso estudo é um início para esse processo.
Thanks in advance!
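One possible starting point for the case described above, where two sentences on one side correspond to a single sentence on the other, is a length-based dynamic-programming aligner. The sketch below is a simplified illustration in the spirit of length-based alignment (it is not the full Gale-Church algorithm, which uses a probabilistic length model). The function name `align_by_length`, the `merge_penalty` parameter and its default value are assumptions made for this example, and the returned indices are 0-based, unlike the 1-based numbering in the question.

```python
import math


def align_by_length(src, tgt, merge_penalty=0.5):
    """Align two sentence lists using 1-1, 2-1 and 1-2 groups chosen by character length.

    Returns a list of (source_indices, target_indices) pairs with 0-based indices.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    # best[i][j] = minimal cost of aligning the first i source and j target sentences.
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    # Allowed group shapes: (source count, target count, extra penalty).
    shapes = [(1, 1, 0.0), (2, 1, merge_penalty), (1, 2, merge_penalty)]

    def length_cost(src_part, tgt_part):
        # Absolute log ratio of total character lengths (+1 avoids log of zero).
        a = sum(len(s) for s in src_part)
        b = sum(len(t) for t in tgt_part)
        return abs(math.log((a + 1) / (b + 1)))

    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for di, dj, penalty in shapes:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                cost = best[i][j] + penalty + length_cost(src[i:ni], tgt[j:nj])
                if cost < best[ni][nj]:
                    best[ni][nj] = cost
                    back[ni][nj] = (i, j)

    if best[n][m] == inf:
        raise ValueError("no alignment with 1-1, 2-1 and 1-2 groups covers both texts")

    # Walk the back-pointers from the end to recover the chosen groups.
    result = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        result.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    result.reverse()
    return result
```

Length alone will not give a perfect alignment, so the output is better treated as a list of candidate merges (every group that is not 1-1) to review by hand than as a final answer.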
