How do you create an online audio converting website? - encode

My question is not specific to the title as such. I could not clearly express what I want to do in the title, so it is a generalization of the idea.
I want to run a website that gathers data from users, say an audio file such as Hello.mp3.
I would like to convert this MP3 file to WAV using software on a PC, such as Any Audio Converter, and after the conversion upload the result back to the web server so that the user can easily download it from the frontend.
My actual purpose is a little different from this example. At school we have some astronomical software, which our professor says is proprietary to our school. I would like other users to upload their data as Excel files, have our astronomy software convert and process them on the backend, and then upload the processed information back for the web user.
How is this possible, and what do I need to study?
Also, please let me know the technical prerequisites for such programming tasks.
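
One way to picture the workflow described above is a small backend routine that stores the upload, hands it to an external converter program, and returns the path of the result for download. The sketch below is only an illustration in Python: `process_upload`, the directory layout, and the `{src}`/`{dst}` command-template convention are all assumptions, and the converter command stands in for whatever tool (an audio converter or the school's software) can be driven from the command line.

```python
import subprocess
import uuid
from pathlib import Path


def process_upload(data: bytes, filename: str, command: list[str],
                   out_suffix: str, work_dir: Path) -> Path:
    """Save an upload, run an external converter on it, return the result path.

    `command` is a hypothetical template: the items "{src}" and "{dst}"
    are replaced with the input and output paths before it is run.
    """
    job_dir = work_dir / uuid.uuid4().hex      # one folder per upload
    job_dir.mkdir(parents=True)

    # Keep only the base name so a crafted filename cannot escape job_dir.
    src = job_dir / Path(filename).name
    src.write_bytes(data)
    dst = src.with_suffix(out_suffix)

    argv = [part.replace("{src}", str(src)).replace("{dst}", str(dst))
            for part in command]
    # No shell is involved, so the file name is never interpreted as code.
    subprocess.run(argv, check=True, timeout=300)

    if not dst.exists():
        raise RuntimeError("converter finished without producing output")
    return dst
```

If ffmpeg is installed, the template could be `["ffmpeg", "-i", "{src}", "{dst}"]` with `out_suffix=".wav"`. A web framework's upload handler would call this function and then serve the returned file. Software that has no command-line interface would need a different integration approach.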

Related

How to convert a human-readable timeline to a table using existing ML tools?

I have this timeline from a newspaper produced by my Native American tribe. I was trying to use AWS Textract to produce some kind of table from this. AWS Textract does not recognize any tables in this. So I don't think that will work (perhaps more can happen there if I pay, but it doesn't say so).
Ultimately, I am trying to sift through all the archived newspapers and download the timelines for all of our election cycles (both "general" and "special advisory") to find the number of days between each item in the timeline.
Since this is all in the public domain, I see no reason I can't paste a picture of the table here. I will include the download URL for the document as well.
Download URL: Download
I started off by using Foxit Reader on individual documents to find the timelines on Windows.
Then I used the tool 'ocrmypdf' on Ubuntu to ensure all these documents are searchable (ocrmypdf --skip-text Notice_of_Special_Election_2023.pdf.pdf ./output/Notice_of_Special_Election_2023.pdf).
Then I just so happened to see an ad for AWS Textract this morning in my Google Newsfeed. Saw how powerful it is. But when I tried it, it didn't actually find these human-readable timelines.
I'm wondering whether any ML tools, or even other solutions, exist for this type of problem.
Mainly, I am trying to keep my tech knack up to par. I was sick the last two years and this is a fun problem to tackle that I think is pretty fringe.
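
Once the timeline text has been extracted (for example from the text layer that ocrmypdf adds), the day counts themselves are simple to compute. The following Python sketch assumes dates written like "March 3, 2023"; that format, the regular expression, and the sample text are assumptions for illustration and are not taken from the actual newspaper.

```python
import re
from datetime import datetime

# Assumed date style: full English month name, day, four-digit year.
DATE_RE = re.compile(
    r"(January|February|March|April|May|June|July|August|September|"
    r"October|November|December)\s+(\d{1,2}),\s*(\d{4})"
)


def find_dates(text):
    """Return the dates found in `text`, in order of appearance."""
    dates = []
    for month, day, year in DATE_RE.findall(text):
        parsed = datetime.strptime(f"{month} {day} {year}", "%B %d %Y")
        dates.append(parsed.date())
    return dates


def day_gaps(dates):
    """Return the number of days between each consecutive pair of dates."""
    return [(later - earlier).days
            for earlier, later in zip(dates, dates[1:])]


# Made-up sample text standing in for one extracted timeline.
sample = """
Notice of election posted: March 3, 2023
Candidate filing deadline: March 17, 2023
Election day: April 29, 2023
"""
print(day_gaps(find_dates(sample)))
```

The harder part of the problem remains locating the timeline on each page; this only covers the arithmetic once the text is available.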

Offline mode on iOS through Core Data

I have an application which has video content (something like YouTube).
I want to add an offline mode to my app (for use with no internet connection).
The end user will download video sources.
Video sources will be saved on the device.
When the app is opened in offline mode, the end user will get the offline video content.
I would be happy to hear the best way to implement this kind of logic. I have heard about Core Data; is there some source code or an example you know about?
Your question is very generic. You should provide some details about your goal.
Core Data in this case could be a lot of work, especially given its learning curve.
A simple solution could be to save the videos on the file system and use plist files (or whatever you want) to hold the paths (metadata) where the videos are stored.
Anyway, if you want to adopt Core Data, it will contain only metadata about the videos, not the videos themselves. As before, they will be saved on disk. Maybe the external storage functionality could be the right way to go.

How to add DRM on epub programmatically?

I'm searching for a method to add DRM to ePub files programmatically. Does anyone know how to do that? Maybe with 3rd party software?
I added DRM as follows:
If the ePub comes from a server, make the zip file password protected, and encrypt the HTML pages inside it with AES-128. You can encrypt images as well, but you need to add more code on the reader side.
If you are encrypting images, then all images must be decrypted before you load the HTML page in the web view or browser.
If you just want to protect your books from copying, then publish them through Amazon or Apple. As long as you don't select otherwise, those bookstores will wrap the books in DRM that allows them to be read only on a limited number of devices belonging to the purchaser.
If you have some reason for wanting to worry about DRM yourself, then after carefully considering why on earth you'd want to, you'll need to find a vendor who can provide both the DRM technology and the reader (perhaps white-labeled for you) which knows how to read those DRM'd books. You see, DRM is useless unless there's a reader that can read the DRM'd books. And what's more, you need a back-end infrastructure to keep track of which devices belong to which person. There are vendors who provide such solutions. However, you'll end up paying them some of the money you were trying to save by avoiding Amazon or Apple.
The pricey Adobe solution mentioned by one commenter has the advantage that it is used by multiple bookstores/reading systems, including Kobo and Sony, so if you use it, then people buying your books can read them on any of these devices--albeit with an annoying step involving some software called ADE.
If for some strange reason you are thinking of building this entire infrastructure yourself, all I can say is, good luck.
More generally, even if you work through Amazon or Apple, it's well worth stepping back and thinking if you really want to do DRM or not. It's a natural human instinct to think, "By golly, I'm not going to let anyone steal MY book!!", but many of the people that might pirate a non-DRM'd book would not have paid money in the first place, so it's hard to say you're actually "losing" money. And someone who pirated the book might then Tweet or tell their friends about it and you'll end up selling more books than otherwise. Finally, someone who really wants the book without paying will crack the DRM anyway, as another commenter noted.

Question about uploaded files in ruby

When uploading a file I know I can access its properties, but are they always the same or do they vary? I mean, I am writing an app for myself where I can upload songs or videos to my server to watch later, and I'd like to populate the info about said files automatically as much as possible, so I was wondering if it's possible to get things like length, quality, name, artists, and artwork, or to pick a first image like YouTube does for its videos?
I'm fairly new to Ruby (using Rails), so I am unsure where to find this or whether it's even possible.
You can do that using FFmpeg (read the license first).
FFmpeg gives you everything you were asking about and some more.
It's very powerful.
For MP3, check out mp3-info. I haven't used it before, but it looks promising...
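
As an illustration of the FFmpeg route, its companion tool `ffprobe` can print a file's metadata as JSON. The sketch below is in Python rather than Ruby and is not the answerer's code; the helper names `probe` and `summarize` are made up here, and which tags are present depends entirely on the file.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe (ships with FFmpeg) and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Pick out a few common fields; missing ones come back as None."""
    fmt = report.get("format", {})
    # Tag names vary in case between containers, so normalize them.
    tags = {key.lower(): value for key, value in fmt.get("tags", {}).items()}
    duration = fmt.get("duration")
    bit_rate = fmt.get("bit_rate")
    return {
        "title": tags.get("title"),
        "artist": tags.get("artist"),
        "duration_seconds": float(duration) if duration is not None else None,
        "bit_rate": int(bit_rate) if bit_rate is not None else None,
        "codecs": [s.get("codec_name") for s in report.get("streams", [])],
    }
```

Calling `summarize(probe("song.mp3"))` on an uploaded file (the file name is a placeholder) would give a dictionary that can be stored alongside the upload. A first-frame thumbnail for a video can be produced with a command along the lines of `ffmpeg -i video.mp4 -frames:v 1 thumb.jpg`.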

Using Ruby And Ubuntu With Optical Character Recognition

I am a university student and it's time to buy textbooks again. This quarter there are over 20 books I need for classes. Normally this wouldn't be such a big deal, as I would just copy and paste the ISBNs into Amazon. The ISBNs, however, are converted into an image on my school's book site. All I want to do is get the ISBNs into a string so I don't have to type each one by hand. I have used GOCR to convert the images into text, but I want to use it with a Ruby script so I can automate the process and do the same for my classmates.
I can navigate to the site. How can I save the image to a file on my computer (running Ubuntu), convert the image with GOCR, and finally save the result to a file so I can access it again with my Ruby script?
GOCR seems to be a good choice at first, but from what I can tell from my own "research", its quality isn't quite sufficient for daily use. This could lead to problems, depending on the image input. If it doesn't work out for you, try the "new" feature of Google Docs, which allows you to upload images for OCR. You can then retrieve the results using some Google API (there are tons out there; I'm using gdata-ruby-util, which requires some hacking, though).
You could also use tesseract-ocr for the OCR part. It's also open source and in active development.
For the retrieval part, I would stick with hpricot as well. It's super-powerful and flexible.
Sounds like a cool project, and shouldn't be too hard if the ISBN images are stored in individual files.
This all can be run in the background:
download web page (net/http)
save metadata + image file for each book (paperclip)
run GOCR on all the images
All you need is a list of URLs or a crawler (mechanize), and then you probably need to spend a few minutes writing a parser (see joe's post) for the university HTML pages.
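
The OCR step rarely returns a perfectly clean string, so it helps to validate what comes back before searching with it. The Python sketch below (the answers here use Ruby; this is only an illustration) shells out to GOCR, then keeps only candidates whose ISBN check digit is correct. The function names are hypothetical, and it assumes `gocr` is installed and can read the image format passed to it.

```python
import re
import subprocess


def run_gocr(image_path):
    """Return the text GOCR recognizes in an image file."""
    result = subprocess.run(["gocr", "-i", image_path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def is_valid_isbn(digits):
    """Check the ISBN-10 or ISBN-13 check digit of a hyphen-free string."""
    if re.fullmatch(r"\d{13}", digits):
        # ISBN-13: weights alternate 1, 3; the total must divide by 10.
        total = sum(int(d) * (1 if i % 2 == 0 else 3)
                    for i, d in enumerate(digits))
        return total % 10 == 0
    if re.fullmatch(r"\d{9}[\dX]", digits):
        # ISBN-10: weights run 10 down to 1; "X" stands for the value 10.
        values = [10 if d == "X" else int(d) for d in digits]
        total = sum(v * (10 - i) for i, v in enumerate(values))
        return total % 11 == 0
    return False


def extract_isbns(text):
    """Find ISBN-like runs in OCR output and keep those that validate."""
    found = []
    for candidate in re.findall(r"\d[\d \-]{8,15}[\dXx]", text):
        digits = re.sub(r"[ \-]", "", candidate).upper()
        if is_valid_isbn(digits) and digits not in found:
            found.append(digits)
    return found
```

A script would call `extract_isbns(run_gocr(path))` for each saved image. Images that yield no valid ISBN can be flagged for manual entry or retried with another OCR engine. The pattern expects each ISBN to be set apart from other digits, so a number printed right next to an ISBN may cause it to be missed.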
