I am new to Sqoop.
I am running the below Sqoop command to import data from an Oracle table:
sqoop import --connect jdbc:oracle:thin:<username>/<password>@<IP>:1521:MSDP2 --query "select * from table_name where \$CONDITIONS AND created=TRUNC(TO_DATE('20171101','YYYYMMDD'))" --target-dir /stage/ESM/esm_tmp --hive-table ESM_tab --hive-import -m 1
This creates a Hive table with a COMMA delimiter. Since one column contains the customer's address as its value, that field itself contains commas, which is corrupting the data in the table.
While googling I found that we can use the "--fields-terminated-by" option in the sqoop command to specify the delimiter we want, but I don't know where to place it in the sqoop command. Can somebody help me by placing the option in the right spot in the above sqoop command? I prefer a | (pipe) delimiter.
You can add --fields-terminated-by '|' anywhere in the command after sqoop import.
You can use it anywhere after sqoop import. The simplest is to put it right after the query:
--fields-terminated-by '|'
It will work. Please try it.
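Putting it together, here is the command from the question with the option added (only --fields-terminated-by '|' is new; everything else is unchanged):
sqoop import --connect jdbc:oracle:thin:<username>/<password>@<IP>:1521:MSDP2 --query "select * from table_name where \$CONDITIONS AND created=TRUNC(TO_DATE('20171101','YYYYMMDD'))" --fields-terminated-by '|' --target-dir /stage/ESM/esm_tmp --hive-table ESM_tab --hive-import -m 1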
Related
I am using UNLOAD in Informix with DELIMITER "|", but I have a problem: some fields contain the | character as part of the data, so I end up with more fields than there should be. I tried other separators, but the result is the same.
Is there a better way to export the data than UNLOAD, or is there a way to export each field enclosed in double quotes and delimited by "|"?
I am using the following:
UNLOAD TO '/ruta/al/archivo/nombre_tabla.txt' DELIMITER '|'
SELECT * FROM nombre_tabla;
I would like to straight up merge multiple pipe delimited files using Awk. Every example I have found on here is several times more complicated than what I am trying to do. I have several text files formatted identically, and I just want to merge them together, like a UNION ALL in SQL. I don't need to join on a column, and I don't care about duplicate rows.
Concatenating the files should work for you then:
cat file1.txt file2.txt file3.txt > finalFile.txt
No need for awk.
That is a job for cat (see @mjuarez's answer), but if you really want to use awk for it:
$ awk 1 files* > another_file
(The 1 is a condition that is always true, so awk applies its default action, printing the line, to every line of input.)
Given that you aim to use awk only:
(g)awk '{print}' file1 file2 file* >> outputfile
James' answer is way better; however, I still wanted to show what I came up with, the very basic use of awk. :)
I am searching for a list of names matching the pattern "japconfig". There are many files inside one directory. Those files contain names like ixdf_japconfig_FZ.txt,
ixdf_japconfig_AB.txt, ixdf_japconfig_RK.txt, ixdf_japconfig_DK.txt, ixdf_japconfig_LY.txt, but I don't know what names appear after the japconfig word. I need to list all such names. My files also contain ixdf_dbconfig.txt, but I don't want to print ixdf_dbconfig.txt in the output.
Each of my files contains one ixdf_japconfig_*.txt and ixdf_dbconfig.txt, where * can be FZ, AB, RK, DK, or LY. I can achieve my desired result by using grep and then awk to cut the columns, but I don't want to use awk or any other command. I want to achieve this using grep only.
I need to print below names.
ixdf_japconfig_FZ.txt
ixdf_japconfig_AB.txt
ixdf_japconfig_RK.txt
ixdf_japconfig_DK.txt
ixdf_japconfig_LY.txt
I don't want to print ixdf_dbconfig.txt.
When I tried the command grep -oh "ixdf_japconfig.*.txt" *.dat, I got the output below.
ixdf_japconfig_FZ.txt ixdf_dbconfig.txt
ixdf_japconfig_AB.txt ixdf_dbconfig.txt
ixdf_japconfig_RK.txt ixdf_dbconfig.txt
ixdf_japconfig_DK.txt ixdf_dbconfig.txt
ixdf_japconfig_LY.txt ixdf_dbconfig.txt
where the first column is my desired column, but I don't want to print the second column. How can I change my command to print only the first column?
grep -oh ixdf_japconfig_...txt *.dat
(Your .*. was matching most of the line.)
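If the suffix really is always two uppercase letters, as in the sample names above (an assumption, not something stated in the question), a stricter pattern avoids the dot wildcards entirely:
grep -oh 'ixdf_japconfig_[A-Z][A-Z]\.txt' *.dat
Here [A-Z][A-Z] matches exactly two uppercase letters and \. matches a literal dot, so ixdf_dbconfig.txt can never slip into the matches.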
In a BASH script, I am reading in a list of strings from a text file that may contain apostrophes ('). Each string in the list is saved to a BASH environment variable that is passed to my psql query. I have tried everything so far, but whenever I loop through the list and encounter an apostrophe, my query fails.
Here is a snippet of the code that fails:
SELECT * FROM table_1 WHERE id = $myid AND name = '$namelist';
namelist is the file that has the entries which may contain apostrophes.
Thanks for your help.
Use a prepared SQL statement to avoid SQL injection.
You may also need a solution from this post.
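One common approach (a minimal sketch, assuming the query is sent through the psql command line and that $myid and $namelist are the shell variables from the snippet above; connection options are omitted) is to hand the values to psql as variables and let its :'...' interpolation quote them:
psql -v myid="$myid" -v namelist="$namelist" <<'SQL'
-- psql expands :'namelist' as a properly quoted string literal,
-- so an apostrophe inside the value can no longer break the statement.
SELECT * FROM table_1 WHERE id = :'myid' AND name = :'namelist';
SQL
Because psql does the quoting, the shell never splices raw text into the SQL, which also closes the injection hole mentioned above.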
I want to export several SPSS custom tables to Excel. I want to export just the tables and exclude the syntax. I tried to select all and exclude it, but I am still getting all of the output.
You can export the output with the OMS command. Within this command you can specify which output elements you want to export.
If you want to export just the custom tables, you can run the following command.
OMS /SELECT TABLES
/IF SUBTYPES = ['Custom Table']
/DESTINATION FORMAT = XLSX
OUTFILE = '/mydir/myfile.xlsx'.
... Some CTABLES Commands ...
OMSEND.
Every custom table (generated from CTABLES commands) between OMS and OMSEND will be exported to a single .xlsx file specified by the outfile option.
See the SPSS Command Syntax Reference for more information on the OMS command.
Here is a complete example of the Output Management System (OMS) exporting to xlsx with CTABLES, using SPSS syntax. Here I have run a custom table between the Month and A1A variables. I have used VIEWER=NO in the OMS syntax, which does not display the custom tables in the SPSS output window but still creates the xlsx output with the desired tables.
OMS
/SELECT TABLES
/IF COMMANDS=['CTables'] SUBTYPES=['Custom Table']
/DESTINATION FORMAT=XLSX
OUTFILE ='...\Custom Tables.xlsx'
VIEWER=NO.
CTABLES
/VLABELS VARIABLES=A1A MONTH DISPLAY=LABEL
/TABLE A1A [C] BY MONTH [C][COLPCT.COUNT PCT40.1]
/CATEGORIES VARIABLES=A1A MONTH ORDER=A KEY=VALUE EMPTY=INCLUDE
/SLABELS VISIBLE=NO
/TITLES
TITLE='[UnAided Brand Awareness] A1A TOM.'
CAPTION= ')DATE)TIME'.
OMSEND.
Try something like this, for which you will need the SPSSINC MODIFY OUTPUT extension:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
/* Switch printback on to demo how to exclude printback in the export */.
set printback on.
ctables /table jobcat[c] /titles title="Table: Job cat".
ctables /table gender[c] /titles title="Table: Gender".
spssinc modify output logs charts headings notes page texts warnings trees model /if process=all /visibility visible=false.
/* Exclude the Custom Table titles */.
spssinc modify output titles /if itemtitle="Custom Tables" process=all /visibility visible=false.
output export
/contents export=visible layers=visible modelviews=printsetting
/xlsx documentfile="C:/Temp/Test.xlsx"
operation=createfile sheet='CTables'
location=lastcolumn notescaptions=yes.
These are good answers, but I wanted to get the simple solution on the record:
Unless there's some reason you need a script (e.g. for automated processes), you can copy and paste the tables straight into Excel.
In the output window, right-click on the table, select "copy", and it will paste into Excel without issue.
Another solution is to use some .sps script written by a smart guy named Reynolds, located here:
http://www.spsstools.net/en/scripts/577/
Simply download it as a .sps file from the right-hand side of the screen and save it into your SPSS folder. At the end of your CTABLES syntax you write this simple one-line command that calls the file and does all the work for you.
script 'N:\WEB\SPSS19\FILENAME.sps'.
It loops through the output window, deletes all syntax/titles, and keeps the custom tables right before your eyes. It works very well and saves me lots of time at work.