size limit on Jena/TDB archives - jena

We are working on a big project (with a lot of metadata) using Jena TDB. About a month ago a memory problem suddenly appeared (the program had been working properly for months and no changes had been made), and we are no longer able to upload any data.
We’ve been working on this issue for several weeks, and we think the problem is caused by some of our .dat files being larger than 16 GB. We’ve read that the index system used by TDB employs 64 bits for each index: 8 bits for the type + 44 bits for disk allocation + 12 bits for the vnode. With 44 bits we can manage 16 GB, and we think this is where the memory problem appears.
Could you please tell us if we are correct? If so, what would be the best solution?
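For reference, here is the bit budget we are assuming, written out in a quick sketch (whether TDB really uses this 8/44/12 layout is exactly what we are asking, so treat it as an assumption):

# Index layout as described above (an assumption, not a confirmed TDB internal).
TYPE_BITS, OFFSET_BITS, VNODE_BITS = 8, 44, 12
assert TYPE_BITS + OFFSET_BITS + VNODE_BITS == 64
addressable_bytes = 2 ** OFFSET_BITS  # range covered by a 44-bit byte offset
print(f"{addressable_bytes:,} bytes, i.e. {addressable_bytes // 2**30:,} GiB")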

Related

Jena tdbloader performance and limits

When trying to load a current Wikidata dump as documented in Get Your Own Copy of WikiData, following the procedure described in https://muncca.com/2019/02/14/wikidata-import-in-apache-jena/, I am running into some performance problems and limits of Apache Jena's tdbloader commands.
There seem to be two versions of it:
tdbloader2
tdb2.tdbloader
The name tdbloader2 for the TDB1 tdbloader is confusing and led to it being used as a first attempt.
The experience with TDB1/tdbloader2 was that the loading went quite well for the first few billion triples.
The speed was 150k triples/second initially. It then fell to some 100k triples/second at around 9 billion triples. At 10 billion triples the speed dropped to 15,000 triples/second and stayed around 5,000 triples/second when moving towards 11 billion triples.
I had expected the import to have finished by then, so I am now even doubting that the progress counter is counting triples rather than lines of Turtle input, which might not be the same thing: the input has some 15 billion lines but only some 11 billion triples are expected.
Since the import had already run for 3.5 days at this point, I had to decide whether to abort it and look for better import options or simply wait a while longer.
So I placed this question on Stack Overflow. Based on AndyS's hint that there are two versions of tdbloader, I aborted the TDB1 import after some 4.5 days, with over 11 billion triples reported as imported in the "data" phase. The performance was down to 2.3k triples/second at that point.
With the modified script using tdb2.tdbloader, the import has been running again over multiple attempts, as documented in the wiki. Two tdb2.tdbloader import attempts already failed with crashing Java VMs, so I switched the hardware from my MacPro to the old Linux box again (which is unfortunately slower) and later back again.
I changed the Java virtual machine to a recent OpenJDK after the older Oracle JVM crashed in a first attempt with tdb2.tdbloader. This Java VM crashed with the same symptoms: # Internal Error (safepoint.cpp:310), see e.g. https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8169477
For the attempts with tdb2.tdbloader I'll assume that 15.7 billion triples need to be imported (one per line of the Turtle file). For a truthy dataset the number of triples would be some 13 billion.
If you look at the performance results shown in the wiki article, you'll find that there is a logarithmic performance degradation. For rotating disks the degradation is so bad that the import takes so long that it is not worthwhile waiting for the result (we are talking multiple months here ...).
In the diagram below both axes have a logarithmic scale.
The x-axis shows the log of the total number of triples imported (up to 3 billion when the import was aborted).
The y-axis shows the log of the batch/avg rates, i.e. the number of triples imported in a given time frame.
The more triples are imported, the slower things get: from a peak of 300,000 triples per second down to as low as only 300 triples per second.
With the 4th attempt the performance was some 1k triples/second after 11 days, with some 20% of the data imported. This would put the estimated finish of the import at around 230 days; given the degradation of the speed, probably quite a bit longer (more than a year).
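For reference, this is roughly how I extrapolate such an estimate; the totals and rates below are placeholders loosely based on the numbers above, not the exact calculation behind the 230-day figure:

# Rough ETA extrapolation (illustrative placeholder numbers, not exact figures).
total_triples = 15.7e9            # assumed total, one triple per Turtle line
imported = 0.20 * total_triples   # roughly 20% done after 11 days (4th attempt)
rate = 1_000                      # observed ~1k triples/second at that point
remaining_days = (total_triples - imported) / rate / 86_400
print(f"~{remaining_days:.0f} more days at a constant 1k triples/second")
# The observed rate keeps degrading, so the real figure would be considerably higher.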
The target database size was 320 GB, so hopefully the result will fit into the 4 TB of disk space allocated for the target and disk space is not the limiting factor.
Since Jonas Sourlier reported success after some 7 days with an SSD, I finally asked my project lead to finance a 4 TB SSD and lend it to me for experiments. With that disk a fifth attempt was successful: for the truthy dataset, some 5.2 billion triples were imported after about 4 1/2 days. The bad news is that this is exactly what I didn't want; I had hoped to solve the problem with software and configuration settings, not by throwing quicker and more costly hardware at the problem. Nevertheless, here is the diagram for this import:
I intend to import the full 12 billion triples soon, and for that it would still be good to know how to improve the speed with software/configuration settings or other non-hardware approaches.
I did not tune the Java VM args or split the files yet, as mentioned in the Apache users mailing list discussion from the end of 2017.
The current import speed is obviously unacceptable. On the other hand, heavily investing in extra hardware is not an option due to a limited budget.
There are some questions that are not answered by the links you'll find in the wiki article mentioned above:
https://www.wikidata.org/wiki/Wikidata:Database_download#RDF_dumps
Failed to install wikidata-query-rdf / Blazegraph
Wikidata on local Blazegraph : Expected an RDF value here, found '' [line 1]
Wikidata import into virtuoso
Virtuoso System Requirements
https://muncca.com/2019/02/14/wikidata-import-in-apache-jena/
https://users.jena.apache.narkive.com/J1gsFHRk/tdb2-tdbloader-performance
What is proven to speed up the import without investing in extra hardware?
e.g. splitting the files, changing VM arguments, running multiple processes ...
What explains the decreasing speed at higher numbers of triples and how can this be avoided?
What successful multi-billion triple imports for Jena do you know of and what are the circumstances for these?

efficiently load and access torch tensor

I have a big binary file (almost 2 GB) containing float32 values. I load it with
t = torch.FloatTensor(torch.FloatStorage(filename))
I will keep accessing this big tensor for 1 to 2 hours when executing my program. I observed that it's very slow for the first 10 to 20 minutes.
Can anyone explain why and provide some advice?
Thanks
It may be because the OS caches things in memory, in the background, without you having to care about it...
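If that is the cause, one option is to warm the OS page cache by reading the file once sequentially before constructing the tensor, so that later random accesses hit RAM instead of disk. A minimal sketch (the file name is a placeholder, and this assumes the file fits in free RAM):

def warm_page_cache(path, chunk_size=64 * 1024 * 1024):
    # Read the whole file once in large chunks and discard the data; the OS
    # keeps the pages cached, so subsequent accesses are served from memory.
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass

warm_page_cache("big_tensor.bin")  # hypothetical file name
# t = torch.FloatTensor(torch.FloatStorage("big_tensor.bin"))  # as in the question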
Preferred Handling of Large Files

How is 32-bit architecture different from 64-bit architecture, mainly in terms of app speed and memory management? [closed]

As far as I know, the OS architecture is generally meant to speed up the OS and add new features with better memory management. But in iOS I am a little bit confused regarding the architectures, which we generally set in our app as below:
Architectures - Standard Architecture (armv7, arm64); Valid Architectures - armv7, arm64, armv7s.
Because of this we are getting many warnings related to data type sizes and conversions, since a 64-bit architecture means processors with datapath widths, integer sizes, and memory address widths of 64 bits.
So my question is: I want to understand what mechanism works behind this when I generate an IPA file for a 32-bit or a 64-bit architecture (I know that after Xcode 6 we only build our app with 64-bit architecture, with Bitcode enabled in our app to reduce the app size).
Can anyone help me understand the architecture mechanism, especially on iOS?
You can think of it like this: in a 32-bit architecture the CPU has 32 address lines connected to the RAM, and each line carries either a 0 or a 1 per clock cycle, so one line can distinguish 2^1 = 2 addresses. With two lines you can form any of the combinations (0,0), (0,1), (1,0), (1,1), i.e. 4 addresses; with three lines, 8 addresses, and so on, always a power of two.
Hence with 32 lines you can address 2^32 = 4,294,967,296 bytes, which is 4 GB. So if you have 8 GB of RAM in a 32-bit CPU system, your OS can address at most 4 GB of that RAM at a time. One workaround is to break a wider address into two parts and store them in 32-bit registers (the very fast storage inside the CPU used to calculate virtual memory addresses) in order to access more than 4 GB.
So basically it is the same on every CPU, no matter whether it is in a phone, laptop, or desktop.
A 64-bit system does, however, consume more power compared to a 32-bit one.
You can try these links and posts to study this thoroughly.
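A tiny Python illustration of the powers-of-two arithmetic above (my own sketch, not tied to any particular OS):

# n address lines can distinguish 2**n distinct byte addresses.
for n in (1, 2, 3, 32, 64):
    print(f"{n:2d} lines -> {2**n:,} addressable bytes")
print(2**32 // 2**30, "GiB addressable with 32-bit addresses")  # 4
print(2**64 // 2**60, "EiB addressable with 64-bit addresses")  # 16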

Data memory and Instructions on PIC18F4321

We are studying the PIC18F4321, and at some point my professor drew the following diagram on the board:
He made it look like instructions (such as ADDLW 0X02, MOVWF 0X24, etc.) would take two addresses in data memory, because each memory address in the PIC18F4321 only holds a byte and instructions are 16 bits wide.
But in the datasheet of the PIC18F4321 I cannot find where it says that these 16-bit instructions are ever stored in data memory. Before he said that, I had assumed that the data memory was for storing register values, not full instructions. On the other hand, I know that there is also program memory, but program memory is not 8 bits wide, which makes his drawing even more confusing.
1) Are 16-bit instructions ever stored in data memory?
2) One way I found to explain the picture is that perhaps the memory in question is not necessarily 8 bits wide; it is just that every address can only hold 8 bits. So <8> would simply state how many bits you can hold at that address. Would this be a reasonable explanation?
1) Are 16-bit instructions ever stored in data memory?
No. Data memory is not used for storing instructions; you cannot execute any code from data memory. All instructions are stored in program memory, which consists of 16-bit instruction words. The datasheet details the format and layout of the different instructions. Some instructions are a single word, some require multiple words. The program memory is addressed by a 21-bit program counter, which encompasses a 2 MB space, although for the PIC18F4321 there is just 8 KB of program memory, which equates to 4096 single-word instructions.
Data memory consists of 8-bit bytes, addressed by a 12-bit bus, which allows up to 4096 bytes of data memory, although the PIC18F4321 has just 512 bytes of data memory, split into two banks of 256 bytes. This data memory contains the SFRs (special function registers) and the general purpose registers (GPRs) that you use in your application.
All of this is explained in greater detail in the datasheet for this device, specifically in Section 5.
The way that program memory is addressed by the program counter (PC) enforces the 16-bit instruction word alignment by forcing the least significant bit of the PC to zero, which forces access in multiples of two bytes. Quoting from the datasheet:
The PC addresses bytes in the program memory. To prevent the PC from
becoming misaligned with word instructions, the Least Significant bit
of PCL is fixed to a value of ‘0’. The PC increments by 2 to address
sequential instructions in the program memory.
I suggest that you thoroughly read Section 5 of the linked datasheet and see if you have any remaining doubts. It contains a lot of detail, but it is well described even though it will take more than one reading to understand it completely.
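If it helps, here is a small sketch (mine, not from the datasheet) that just re-derives the sizes quoted above:

# Program memory: 21-bit program counter, 16-bit (2-byte) instruction words.
pc_space = 2 ** 21                  # 2,097,152 bytes = 2 MB of addressable space
flash_bytes = 8 * 1024              # the PIC18F4321 has 8 KB of program memory
print(pc_space, flash_bytes // 2)   # 2097152 and 4096 single-word instructions

# Data memory: 8-bit bytes on a 12-bit address bus.
data_space = 2 ** 12                # 4096 bytes addressable
installed = 2 * 256                 # 512 bytes installed, two banks of 256
print(data_space, installed)        # 4096 and 512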

Determine how many memory slots are available in a specific computer?

I'm studying assembly language on my own with a textbook, and I have a question about the memory of a computer. The book says the possible memory in a 32-bit PC is 4,294,967,296 bytes, which is 4 GB. This is because the last memory location is FFFFFFFF base 16 (8 F's there). It also goes on to say that 2^10 is 1 KB, 2^30 is 1 GB, etc. It also addresses 64-bit machines, saying 64-bit mode can internally store 64-bit addresses and "that at the time this book was written, processors use at most 48 bits of the possible 64". It goes on to say that this limitation is not much of a restriction, because it still allows addressing up to 2^48 bytes of physical memory (256 TB), which is 65,536 times the maximum in 32-bit systems. It also finally talks about RAM and how it basically provides an extension of memory. Okay, so I just wanted to tell you what my book has been telling me, and now here is the question it poses:
Suppose that you buy a 64-bit PC with 2 GB of RAM. What is the 16-hex-digit address of the "last" byte of installed memory?
I tried to tackle it by saying we know from the book that 2^30 = 1 GB, and I set 2^x = 2 GB. I knew that one physical address is one byte, so I converted 2 GB to the corresponding number of bytes. I then took the base-2 log of that byte count to solve for x. I got 2^31 in the end, but that was a lot of work. I then converted it to hex, giving me 80000000 base 16. And then I was stumped. I looked at the answer in the back of the book and it says this:
2 * 3^20 = 2^31 = 80000000 base 16, so the last address is 000000007FFFFFFF.
How did the book get 3^20? That doesn't even equal 2^31 when you multiply it all out by 2. How do you solve this problem?
In addition, how does RAM correspond to memory? Is it an extension of the physical memory? The book doesn't actually say; it just says it is wiped every time the computer shuts off, etc. Could you give me more insight on this?
Thanks,
-Dan
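A quick check of the arithmetic in the question (assuming the book's "3^20" is a typo for 2^30):

# 2 GB of byte-addressable RAM, with addresses starting at 0.
ram_bytes = 2 * 2 ** 30        # 2 GB = 2**31 bytes
last_address = ram_bytes - 1   # addresses run from 0 to 2**31 - 1
print(hex(ram_bytes))          # 0x80000000
print(f"{last_address:016X}")  # 000000007FFFFFFF, the 16-hex-digit address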

Resources