Too large to import - google-sheets

I am working on a case study to analyze bike-share travel data. My plan was to download the previous 12 months of trip data here.
Specifically, the first 12 .csv files on this page.
I first unzipped the files of interest. When I went to Google Sheets (File > Import > Upload) and selected my .csv files, I ran into a problem: some of the months were too big to import.
For example, 202103-divvy-tripdata.csv has a size of 42,535 KB and it imported fine (as did all the .csv files below that size).
202011-divvy-tripdata.csv has a size of 45,121 KB and it didn't work (nor did any of the larger files). Is ~45,000 KB the max?
Each file has the same number of columns (13) but a different number of rows. The 202103 file, which seemed to be right at the limit, had ~228,500 rows.
I appreciate any feedback. I am new to data analysis. Thank you.

The problem you're facing is caused by the cell limit in Google Sheets. You can import your file by splitting it first: https://www.splitcsv.com can split large .csv files into pieces that can then be uploaded to Sheets or Excel.
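If you'd rather not upload the data to a third-party site, a local split does the same job. A minimal sketch in Python, assuming plain comma-separated files with a single header row (split_csv and the 100,000-row chunk size are illustrative choices, not anything the service above requires):

```python
import csv
from pathlib import Path

def split_csv(src, rows_per_part=100_000):
    """Split `src` into numbered part files, repeating the header row
    in each part so every piece imports as a standalone CSV."""
    src = Path(src)
    with src.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        part, rows = 0, []
        for row in reader:
            rows.append(row)
            if len(rows) == rows_per_part:
                part += 1
                write_part(src, part, header, rows)
                rows = []
        if rows:
            part += 1
            write_part(src, part, header, rows)

def write_part(src, part, header, rows):
    out = src.with_name(f"{src.stem}_part{part}.csv")
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

split_csv("202011-divvy-tripdata.csv")  # yields ..._part1.csv, _part2.csv, ...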

Solution
You may want to break the file into smaller parts and import them separately, either into the same spreadsheet or into a different one.
Explanation
There is a limit on the size of spreadsheets you can store in Google Drive:
Up to 10 million cells or 18,278 columns (column ZZZ) for spreadsheets that are created in or converted to Google Sheets. [...] The limits are the same for Excel and CSV imports.
When you convert a document from Excel to Google Sheets, any cell with more than 50,000 characters will be removed in Sheets.
I was importing multiple CSV files with 60,000+ rows and 7 columns into the same spreadsheet; a couple of them worked, but eventually I hit this error. I then learned that I could import the next file into a new spreadsheet and continue my work.
So the limit applies not only to the source file being imported, but also to the destination spreadsheet.
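For scale, using the numbers from the question: each trip file has 13 columns, so the ~228,500-row 202103 file alone is roughly 228,500 × 13 ≈ 2.97 million cells. Four files of that size in a single spreadsheet would already exceed the 10-million-cell cap, even though each one imports fine on its own, which matches the pattern of a few imports succeeding before the error appears.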

If this doesn't have to be an online spreadsheet, you can take a look at GS-Calc:
https://citadel5.com/gs-calc.htm
It opens .csv files with up to 12 million rows. The free trial version is a 4 MB setup file (which expands to about 7 MB after installation, and you can install it on any portable USB device). You can test the speed and memory usage with your files. (Loading should be more or less instant for files of that size.)
It can also open non-csv text files, with any cell separators (including nulls) or without cell separators (that is, with fixed column widths).
Regarding analysis, I believe it offers everything the spreadsheets mentioned above do, including pivot tables and any type of filtering.

Related

Debugging a System.IO.Packaging output that won't open in Excel, possibly a content_types.xml issue?

I'm trying to create a 2007+ Excel document (.xlsx) using nothing but VB.net and .NET 4.0 built-in classes, so I can't use any third-party libraries, including the Open XML SDK. The code already works fine for basic workbooks, multiple sheets, styles, etc. The resulting files open fine in Excel, Calc, etc.
Now I am trying to add pivotTables. They are a bit weird, as they are spread across multiple files. I have created the Cache and Table in their proper "folders", added the pointer from the workbook to the cache and the various related _rels entries... everything seems OK, but Excel always reports "... found unreadable content in ...".
The only obvious issue I can see is that the content_types.xml file in the resulting Package does not have an Override for the pivotTable. It does have one for the Cache, which I do not add in code, so I'm assuming Packaging noticed the Cache and added a line there. I'm not clear on why the pivotTable didn't show up.
So I'm looking for experts in Packaging who might point me in the right direction: is a missing line in content_types fatal to opening the file, and if so, how can I force the pivotTable entry to appear?

Why does one need to compile twice to get a table of contents in the PDF?

I am working on a project in LaTeX, but I don't understand why I have to compile twice to insert a table of contents into a PDF document.
On the first compilation, TeX does not yet know which sections will come later in the document. During that first pass it collects the names and page numbers of the sections into the .toc file; on the second compilation it can then use this information to build the table of contents.
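In build terms, this is why scripts simply run the compiler twice. A minimal sketch, assuming pdflatex is on PATH and a document named main.tex (the filename is just a placeholder):

```python
import subprocess

# Pass 1 typesets the document and writes the section headings and
# page numbers to main.toc; pass 2 reads main.toc back and can
# therefore typeset the table of contents.
for _ in range(2):
    subprocess.run(["pdflatex", "main.tex"], check=True)
```

Tools like latexmk automate exactly this rerun-until-stable loop, so you don't have to count passes yourself.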

Flutter: how to see Documents & Data, and does removing imported packages reduce size?

My app's Documents & Data is 3 GB on iOS.
I just want to see what comprises Documents & Data to make it so huge.
Also, if a widget has 15 imported packages, does removing those imports help with the app's Documents & Data size? Or does the majority of the size increase come from the packages themselves, declared in the pubspec.yaml file?
Hi, you can measure the app size with this link. When you debug your app the size is larger, but when you compile a release build the size is smaller.
To check the size of the packages and more, you can pass the --analyze-size flag when building your app, e.g. flutter build ios --analyze-size.

Unconcatenating files

I have a corrupted 7-zip archive that I am extracting manually using the method outlined by Igor Pavlov at this link. An intermediate result is a large file that is a bunch of files cat'ed together that must be separated manually. I understand that some file formats will need to be extracted manually by a human using discretion (text files, etc.) but many file formats encode the size of the file as part of the file itself (e.g. .zip). Furthermore, some files can be parsed and their size can be deduced with just a little information about the file format (e.g. .pdf). Let's say the large file consists of the following files concatenated together:
Key: <filename>(<contents>)
badfile(aaaaaaaaaaabbbbbbbbbcccccccdddddddd) -> zip1.zip(aaaaaaaaaaa)
badfile2(bbbbbbbbbcccccccdddddddd)
I am looking for a program that I can run on a large file (call it badfile) that can determine the type and size of the first logical file (let's say it's a .zip file) contained within and create a new file to hold the contents (e.g. zip1.zip since filenames are lost) and chop the file off the front of badfile. This would allow me to run the program in a loop to extract files with known types and/or pause and let the user handle the difficult cases. Does such a program exist? I know that the *nix command file(1) will do a lot of the work here, but there would be a lot of effort in encoding rules for sizing files (e.g. .pdf) that I would prefer to not duplicate.
I believe this question should be closed as off-topic, since it asks for existing programs to solve the problem, but the open bounty prevents a close vote. However:
Does such a program exist?
Yes, such programs exist; they are called data carving tools.
Some common ones include scalpel, foremost, and PhotoRec.
A list of other tools is available here.
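To illustrate the idea these tools automate, here is a minimal sketch for the ZIP-at-the-front case from the question (carve_leading_zip is a hypothetical helper; real carvers handle false signature matches far more carefully):

```python
import io
import zipfile

def carve_leading_zip(path, out_path="zip1.zip"):
    """Split a ZIP archive off the front of a concatenated blob."""
    data = open(path, "rb").read()
    if not data.startswith(b"PK\x03\x04"):   # ZIP local file header magic
        return None
    # A ZIP ends with an End Of Central Directory record (22 bytes
    # plus an optional comment), so its true length is recoverable
    # from the data itself.  find() takes the first match, which can
    # be a false positive inside compressed data -- hence the check.
    eocd = data.find(b"PK\x05\x06")
    if eocd == -1:
        return None
    comment_len = int.from_bytes(data[eocd + 20:eocd + 22], "little")
    end = eocd + 22 + comment_len
    candidate = data[:end]
    if zipfile.is_zipfile(io.BytesIO(candidate)):
        open(out_path, "wb").write(candidate)
        open(path + ".rest", "wb").write(data[end:])  # remainder to re-scan
        return out_path
    return None
```

Run in a loop over the .rest remainder, this is essentially the extract-and-chop workflow the question describes, just for one file type.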

Inkscape screws up EPS files

I have been trying to use Inkscape to prepare artwork for my scientific papers. I use LaTeX, and I need my figures to be high-quality Encapsulated PostScript (EPS) images. My workflow is as follows. First, I plot the parts of my figure using matplotlib and save them in EPS format. Second, I launch Inkscape and import the EPS files. In Inkscape I compose a figure, keeping the objects I need, deleting the ones I don't, and adding some markup. This is how I used to work with CorelDraw on Windows, but now I work on Linux.
Unfortunately, Inkscape damages EPS files: it changes the colors and does not save all the objects. Over the last few years I have searched for a solution, but I cannot find anyone complaining about this. The complaints I did find on the Web relate to things like "incorrect font rendering" when exporting from SVG to EPS or back. (For me this is not a problem: text can always be converted to curves.)
I currently work in Mandriva Linux 2010 and use Inkscape version 0.47 r22583 (Jan 14 2010). Somewhere I read that such problems could be caused by outdated versions of cairo; mine is 1.9.14. I spent a lot of effort trying to build a newer cairo (1.12.14), but I am still far from that goal. I got confused by the 32- and 64-bit libraries coexisting on my system...
I would be very grateful to anyone who has had similar problems and, maybe, has advanced further towards a solution. Let me illustrate the problem.
Sorry, I do not have enough reputation points to post images or insert more than 2 links, so please take a look at the copy of this post with the images on my LiveJournal page:
http://benkev.livejournal.com/1093.html
The figure captions are below.
(1) Here are the three EPS images I would like to combine into one figure.
(2) Here is what I get after importing the images into Inkscape and saving in SVG format. Note the color and resolution distortion. I have also drawn three red circles around the features of interest.
(3) Here is what I get when I export this figure to an EPS file. Notice that one of the three red circles is gone: only two circles are left!
Thank you!
This appears to be a bug in Inkscape. The following steps might help:
Open the SVG file in Inkscape.
Select all (Ctrl+A).
Ungroup (Ctrl+Shift+G). You may need to repeat this step several times.
Save the result in EPS format.
For what it's worth, after more than one year: I've been experiencing the same problems with Inkscape v0.48: the EPS was missing items when opened in other software (e.g. LaTeX).
I didn't completely solve the problem, but I found that it helped to remove groups. Simply select all components and keep ungrouping until there are no groups left. Save as EPS and the result should be better.
If there are still items missing, try using 'raise selection to top' on the missing items and save again.
I know this is old, but the bug is still present in Inkscape, so here's my two cents. My workaround is to save a copy of my project as "Plain SVG" and export that as EPS.
I hope it helps!
I created a new layer and moved the text that was not showing up in the EPS to this layer. Then it showed up in the exported EPS file.
P.S. Make sure you place the new layer below the current layer before moving the text there.
It is a bug in Inkscape (0.91, Windows), but there is an easy fix: save directly to PDF from Inkscape, and then save the PDF file as EPS. Works like a charm for me.
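The same PDF route can be scripted if you have many figures. A minimal sketch, assuming an Inkscape 0.x command line (--export-pdf; Inkscape 1.x renamed it --export-filename) and pdftops from poppler-utils, where svg_to_eps_via_pdf is just an illustrative helper:

```python
import subprocess

def svg_to_eps_via_pdf(svg_path, eps_path):
    """Convert SVG to EPS by way of PDF, avoiding the buggy EPS exporter."""
    pdf_path = svg_path.rsplit(".", 1)[0] + ".pdf"
    subprocess.run(["inkscape", svg_path, f"--export-pdf={pdf_path}"],
                   check=True)
    subprocess.run(["pdftops", "-eps", pdf_path, eps_path], check=True)

svg_to_eps_via_pdf("figure.svg", "figure.eps")
```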
A permanent workaround for this problem is to export your SVG to a PNG and then export the PNG (e.g. via the free software GIMP) as an EPS file. The missing items are always included when I use this approach. (Note that this rasterizes the figure.)
