Is there a way to get the language from Medium posts? - medium.com-publishing-api

I need to review numerous articles every day, but I'm only interested in articles in a specific language (Portuguese, pt-br, in this case).
I even read Medium's API documentation and didn't find any language information, but I'm not a technical reference lol
I would like to:
Know if there is any way to get the language of an article from Medium (Medium.com), so that I can return only articles in Portuguese
If that is not possible, get recommendations on how I could collect published articles and filter them by language using another technique or technology (e.g. artificial intelligence)

Medium currently does not support a way to filter stories by language.
However, you can change the app language. If the app doesn't load in your language, you can change that in your phone settings.
Medium's help page about language availability has further details.
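Since Medium exposes no language field, one way the filtering could be done on your side is a stopword count: tally how many very common Portuguese words appear in the text versus common English ones. A minimal sketch in Ruby, assuming the article text has already been collected some other way. `STOPWORDS`, `guess_language`, and `portuguese_only` are hypothetical names for illustration, and the word lists are deliberately tiny:

```ruby
# Hypothetical, deliberately short stopword lists. Words that exist in both
# languages (such as "a", "as", "do", "no") are left out on purpose.
STOPWORDS = {
  'pt' => %w[de que não para com uma os em um por mais como foi ao das],
  'en' => %w[the of and to in is that for it with was on are this have]
}.freeze

# Returns the language code whose stopwords occur most often in the text,
# or nil when no stopword from any list is found.
def guess_language(text)
  words = text.to_s.downcase.scan(/\p{L}+/)
  scores = STOPWORDS.transform_values do |list|
    words.count { |word| list.include?(word) }
  end
  best_lang, best_score = scores.max_by { |_, score| score }
  best_score.zero? ? nil : best_lang
end

# Keeps only the articles whose :text looks Portuguese.
# Each article is assumed to be a hash with a :text key.
def portuguese_only(articles)
  articles.select { |article| guess_language(article[:text]) == 'pt' }
end
```

This heuristic is unreliable on very short texts and cannot tell pt-br from pt-pt, so a dedicated language-detection library or service is probably the sturdier choice once the idea is validated.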

Related

Microsoft Read Aloud with bilingual (non-English) websites or PDF files

I see that Microsoft's Read Aloud feature works in only one language at a time. That is, you can select a language and ask Edge to read a web page in that language. If your web page is written in multiple languages, for example a dictionary page (where there are two languages on the same page), Edge will read aloud only one of them, not both.
Even when a regional language is selected, I see that Read Aloud also picks up English (with the local accent, of course) alongside that language. That is, for a web page written in English and another language, it reads words from both languages without you manually switching the language and voice.
But this works only for a language in combination with English.
I would like to know if there is any way to read aloud two non-English languages on the same page without manually switching the language or the voice.
The Microsoft Edge Read Aloud feature is powered by Text to Speech, whose documentation lists some of the languages that support the cross-lingual feature (custom neural voice required). But apparently it has not been applied to Edge yet, since the feature needs to be purchased in Azure.

Parsing an XML string with no sense of keys

I'm using OpenUri and RSS in Rails 5.2.3 and Ruby 2.6.1 to do this.
I'm trying to parse WeWorkRemotely's RSS feed, but it has one field, description, that contains all the information in a single string. For example, when I parse it in Rails it returns:
"<img src=\"https://we-work-remotely.imgix.net/logos/0015/9022/logo.gif?ixlib=rails-2.1.3&w=50&h=50&dpr=2&fit=fill&auto=compress\" alt=\"Logo.gif?ixlib=rails 2.1\" />\n\n<p>\n <strong>Headquarters:</strong> San Francisco \n <br /><strong>URL:</strong> http://www.loom.com\n</p>\n\n<h1><strong>About Loom</strong></h1><div>Loom is a new kind of work communication tool, already helping over a million people get their message across through instantly shareable videos. Our users work at companies like HubSpot, Square, Uber, GrubHub and LinkedIn. Our mission is to be the global leader in human workplace communication. Founded in 2016, Loom has raised $15 million from top-tier investors including Kleiner Perkins, General Catalyst and Slack Fund.</div><h1><strong>The Role</strong></h1><div>As a Technical Support Engineer, you will be a key part of Loom's support experience at scale and provide timely and effective resolution to customer issues by applying your technical and troubleshooting skills.</div><div><br></div><div>We are looking for support champions who are genuinely happy to help others. 
If this sounds like you, you came to the right place!<br><br><strong>As a Technical Support Engineer, you  will…</strong>\n</div><ul>\n<li>Help customers through email to ensure they are successful with our product</li>\n<li>Leverage effective troubleshooting to quickly identify the source of customer issues and provide a prompt and appropriate solution</li>\n<li>Troubleshoot, investigate, and create detailed bug reports for our Engineering team</li>\n<li>Jump on ad-hoc calls with customers to troubleshoot issues live, as necessary</li>\n<li>Identify bugs, test, report, and working with our Engineering team to assist with a fix</li>\n<li>Actively collect insights from customers and focus on closing the communication loop by providing product feedback to the team</li>\n<li>Provide timely updates to the Support and Engineering Managers regarding new trends in issues</li>\n<li>Develop and document best practices to enhance SL2 troubleshooting processes</li>\n<li>Create technical documentation such as FAQs, guides, knowledge-base articles and how-to’s for Loom customers</li>\n<li>Help the Engineering team develop tools to help our Support team work quickly and efficiently</li>\n<li>Dive into the codebase and gaining domain knowledge of different parts of Loom</li>\n<li>Make efficient changes to the codebase to solve small and quick tasks/issues</li>\n</ul><div>\n<br><strong>You could be a good fit if you have..</strong>\n</div><ul>\n<li>Previous experience delivering excellent support experiences with respect, empathy and understanding</li>\n<li>A minimum of 4+ years of Technical Support and Customer Support experience</li>\n<li>Gained experience/proficiency in Saas solutions and electron apps (CSS, JavaScript, HTML) or have earned a degree in a technical field like computer science</li>\n<li>Technical understanding and ability to troubleshoot and resolve technical problems on your own</li>\n<li>The ability to handle high volume of support 
conversations</li>\n<li>Excellent written and spoken English</li>\n<li>Are available to work in the Central or Pacific Time Zone and on a full-time schedule that may span weekends and may include holidays as our customers need us</li>\n</ul><div>\n<br><strong>A bonus if you have experience with...</strong>\n</div><ul>\n<li>Installation, configuration, and troubleshooting of Windows and Mac</li>\n<li>Troubleshooting protocols like HTTP, HTTPS, WebSockets, DNS</li>\n<li>Understanding of TCP/IP and ARP to run packet traces and troubleshoot network issues</li>\n<li>Any of these certifications: Cisco CCNA, Microsoft Certified Solutions Expert (MCSE), Apple Certified System Administrator, CompTIAA+, CompTIA Network+</li>\n</ul><div><br></div><div><strong>Perks at Loom</strong></div><div><br></div><div>* Competitive compensation and equity package</div><div>* Medical, dental, and vision coverage (US-based team), healthcare reimbursement (non-US based team)</div><div>* Unlimited PTO</div><div>* Remote-first team</div><div>* Paid parental leave</div><div>* Yearly off-site retreats (this year we went to Costa Rica for a week!)</div><div>* Learning & Development reimbursement</div><div>* Wellness reimbursement</div><div> </div><div><strong>SF office perks</strong></div><div>* Remote weeks every other month</div><div>* Daily in-office lunch, unlimited snacks & drinks</div><div><br></div><div><strong>Remote-specific perks</strong></div><div>* Home office & technology stipends</div><div>* New Hire Onboarding in SF</div><div><br></div><div><strong>Loom is an equal opportunity employer.</strong></div><div>We are actively seeking to create a diverse work environment because teams are stronger with different perspectives and experiences.</div><div><br></div><div>We value a diverse workplace and encourage women, people of color, LGBTQIA individuals, people with disabilities, members of ethnic minorities, foreign-born residents, older members of society, and others from minority 
groups and diverse backgrounds to apply. We do not discriminate on the basis of race, gender, religion, color, national origin, sexual orientation, age, marital status, veteran status, or disability status. All employees and contractors of Loom are responsible for maintaining a work culture free from discrimination and harassment by treating others with kindness and respect.</div>\n\n<p><strong>To apply:</strong> https://jobs.lever.co/useloom/15398ec6-b2c1-4f95-9ef5-8fa2a62c1bed?lever-origin=applied&lever-source%5B%5D=WeWorkRemotely</p>\n"
What would be the best way for me to actually grab data from this block? Even if I only try to pick out things like the img src, the headquarters, or the a href links, it's one big string that I can't split in any way that makes sense.
Don't treat it as a string, treat it as an HTML document. Then you can employ the full power of CSS or XPath selectors (or even manual traversal using Ruby methods).
require 'nokogiri'
doc = Nokogiri::HTML.fragment(str)
# img src
doc.at_css('img')["src"]
# => "https://we-work-remotely.imgix.net/logos/0015/9022/logo.gif?ixlib=rails-2.1.3&w=50&h=50&dpr=2&fit=fill&auto=compress"
# headquarters
doc.at_xpath('.//strong[contains(text(), "Headquarters")]/following-sibling::text()').text.strip
# => "San Francisco"

Default site language based on the ip with rails 3

I have a multilingual site. Based on the location of the visitor (from their IP), I'll use one language or another (by setting I18n.locale).
For example, for a visitor from France the default language will be French, and for a visitor from the US the default language will be English.
Which gem do you recommend for that? There is a wide range to choose from.
Thanks
Some countries have more than one official language, or multiple commonly spoken languages. For these countries, you may want to consider defaulting to the most prevalent spoken language and then displaying a list ranked by prevalence. You can get the official and common languages spoken, with percentages, from the CIA World Factbook listing on country languages:
https://www.cia.gov/library/publications/the-world-factbook/fields/2098.html
If you need to map the language to its ISO 639-1 two-letter or ISO 639-2 three-letter code, you can get that information from the US Library of Congress at:
http://www.loc.gov/standards/iso639-2/php/English_list.php
The OpenGeoCode.Org Team
Andrew
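The "most prevalent language first, then a ranked list" idea can be sketched as a plain lookup table. This assumes the visitor's country code has already been obtained from the IP by some GeoIP lookup, which is not shown. `COUNTRY_LANGUAGES`, `AVAILABLE_LOCALES`, `ranked_locales`, and `default_locale_for` are hypothetical names, and the rankings are illustrative only and should be taken from a real source such as the listings above:

```ruby
# Illustrative ranking of languages per country, most prevalent first.
# Replace with data from a real source before relying on it.
COUNTRY_LANGUAGES = {
  'FR' => %w[fr],
  'US' => %w[en es],
  'CH' => %w[de fr it],
  'BE' => %w[nl fr de]
}.freeze

# Locales the site actually has translations for (assumption).
AVAILABLE_LOCALES = %w[en fr de].freeze
DEFAULT_LOCALE = 'en'

# Languages of the country that the site supports, in ranked order.
def ranked_locales(country_code)
  COUNTRY_LANGUAGES.fetch(country_code.to_s.upcase, []) & AVAILABLE_LOCALES
end

# First supported language for the country, else the site default.
def default_locale_for(country_code)
  ranked_locales(country_code).first || DEFAULT_LOCALE
end
```

In a Rails controller the result would typically be assigned to I18n.locale, and `ranked_locales` could feed the list of alternatives shown to the visitor.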

Crowdsourcing translation for mobile developers?

I am developing applications for mobile phones with different operating systems (Android, Symbian, iPhone). The applications are sold internationally, so they need to be translated into different languages in addition to the English version.
I assume most mobile developers have the translations done by some paid external service each time. This approach does not look very cost-effective to me. Would it make sense to have a website where simple translations would be crowdsourced (by other developers)? Most strings in mobile applications are very simple and short, for example "OK", "Cancel", "Are you sure?", "Please enter your password". The same strings are also used in hundreds of applications. Instead of paying to translate all strings, developers could save money by buying only their difficult, application-specific translations.
Does anyone agree with this idea? I have seen many open-source projects doing their translations successfully using volunteers.
I just found a solution for me. Many users find this question through Google, so I think my post might be helpful:
This is the solution for us: crowdin.com - agile localization solution for tech companies
Microsoft allows you to view their terminology database: https://www.microsoft.com/Language/en-US/Default.aspx
That covers about 90 languages and will get you the things you mention such as common button captions, etc.
The problem you face after that is getting only the strings you want translated. Most translators are going to charge you for a minimum number of words. And they are going to want the entire resource file (regardless of whether you translated some of it yourself or not). That makes sense, because localizing a product means they need the whole picture to ensure consistency, etc. Professional translators will probably not charge you for what they call 100% matches.
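To see how many of your strings would count as 100% matches, you could split the resource file against a glossary of common captions before asking for a quote. A rough sketch, assuming the strings are available as a plain array. `SHARED_GLOSSARY` and `split_by_glossary` are hypothetical names, and the three English-to-German entries are only illustrative:

```ruby
# Hypothetical shared glossary of common UI strings (English => German).
SHARED_GLOSSARY = {
  'OK' => 'OK',
  'Cancel' => 'Abbrechen',
  'Are you sure?' => 'Sind Sie sicher?'
}.freeze

# Splits source strings into exact glossary matches (already translated)
# and strings that still need a translator.
def split_by_glossary(strings, glossary = SHARED_GLOSSARY)
  matched, pending = strings.partition { |string| glossary.key?(string) }
  {
    translated: matched.to_h { |string| [string, glossary[string]] },
    pending: pending
  }
end
```

The translator would still receive the whole file for context; the split only shows which entries are exact matches and which are application-specific.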
I would never ever trust the translation of my product to crowd sourcing. Ever. You get what you pay for. Besides, just because you speak a language natively doesn't mean that you can write well, etc.
How do you check the crowd sourcing translation results for accuracy and quality? In a famous and documented occurrence recently the phrase "No lorries by this route please use the main road" was translated into "We are out of the office until Monday please contact us again then" and turned into road signs that were erected.
Crowdsourced translation has been used, and Facebook is probably the largest company I know of that tried it. I have not tracked their progress, but you could investigate it to see its success or otherwise. Their method of quality checking was to get other people using the translations to vote for the one they preferred, so this was a case of crowdsourcing quality control. At this point the proposal that a camel is a horse designed by a committee jumps unbidden into my mind.
Translation, in spite of all the machine power pumped into it, is still more of an art than a science. To translate correctly you need a native speaker translating from another language into their own. So for English to German you need a native German speaker who can speak English very well. Within the profession very, very few translators will translate into a language in which they are not native. The reasons for this are many but boil down to the colloquial nature of language.
To be positive, you could look at how Facebook fared and follow that route. Another route would be to approach not translators but a translation agency; there are quite a number of these. Present them with the whole corpus you want translated in the original English and get them to quote you for the whole job. This would mean someone else managing the job and the quality, and they may have shortcuts, especially if the translations are of fairly standard "computerese" phrases, e.g. 'Home', 'Back', 'Next', 'Click here' etc.

Browser language: autodetect vs user select?

I am designing a localized web app. I am leaning toward auto-detecting the browser language setting, but I notice a number of respectable sites asking the user to select a language. Is there any usability issue you know of (from actual experience out there) with just auto-detecting the user's language?
Thanks.
Give me a choice
Remember my choice
Use the auto-detect as default
Make transition easy
In many situations I prefer or even need the "original" over my local one, bad translations or different content being the major reasons.
If you register multiple domains, you can base your auto-detect on that: when foo.com redirects me to foo.de, or otherwise shows me a German interface, it is actively ignoring my choice to go to foo.com.
MSDN insisted on showing me atrocious automatic translations and ALWAYS made me click to get to the readable, understandable English one (and that was a step up: when they introduced it, the default selection for changing the language was something like Afrikaans).
Make transition easy: i.e. make it easy to go to the counterpart of the current page in a different language. Amazon often succeeds when I change ".com" to ".de", but then it fails to lead me to the German translation of the item. That's not always possible, as it requires each local view to have the same structure and a 1:1 page mapping. But generally, you have to weigh the above requirements against other constraints of the project.
[edit] MSDN got better now :)
I would suggest autodetecting the language and displaying the site in that language, or in the default language (probably English) if the translation is not available. Additionally, present the user with a selection of languages at the top or bottom of your page. The names of the languages should be written in the target language.
Don't do it like that: English, German, Italian.
But: English, Deutsch, Italiano.
Obviously there is the usability problem that you might detect a language that the user doesn't understand. How are you going to do the detection? Don't assume everybody has their browser set to the correct language. IP addresses are also a very bad indicator of the user's language.
Practical example: YouTube tried to convince me for a week or so to use the Japanese version, though I can't read Japanese. Not very helpful. Microsoft is also determined to serve me automatically translated versions of their documentation when I just want to read the English one.
So don't try to tell your users which language they're supposed to prefer, let them decide for themselves.
I really hate non-configurable auto-detection because a lot of applications are translated far from perfectly. I would rather read perfect English than bad Russian. For example, some terms do not translate in a reasonable way, and trying to translate everything makes the localized version faintly ridiculous.
Also some applications can not translate new features fast enough, leading to a mixed language.
So I always prefer to have a choice, and choose the version that is native to the application author -- for the best language (unless it is a language I do not know).
Update:
One situation when it has gone beyond ridiculous is DB2 (or its client tools, not sure), which forced me to install a Russian version, but all errors in this version were shown as "???????? ??? ??? ??".
Yes: at work, we have Windows XP deployed with the 'English' language setting (because we have sites worldwide and only one kind of Windows to deploy, with only one set of language settings).
Yet all our applications must run in French. The auto-detect feature alone would not be enough for an appropriate display of the labels.
Sometimes when you are trying to describe something to a user over the phone and you are in a different location, it is very annoying when you are both looking at the same URL but see different results. You might even go so far as to include the language in the URL, similar to how Wikipedia does it (e.g. en.wikipedia.org).
Also sometimes a user will be on a friend's computer and try to access a website but won't see it in their preferred language, because of the language settings on the computer.
I think the best solution would be to allow the user to override the setting, but default it to the auto-detected language.
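That precedence (the user's saved choice first, then the browser's setting, then a site default) can be sketched as follows. The sketch assumes the auto-detection reads the Accept-Language request header and that the saved choice comes from a cookie or user profile. `SUPPORTED`, `DEFAULT`, `preferred_languages`, and `choose_language` are hypothetical names:

```ruby
# Languages the site is translated into (assumption) and the fallback.
SUPPORTED = %w[en de it].freeze
DEFAULT = 'en'

# Parses an Accept-Language header such as "de-CH,de;q=0.9,en;q=0.8" into
# primary language codes, highest q-value first, ties kept in header order.
def preferred_languages(header)
  entries = header.to_s.split(',').each_with_index.map do |part, index|
    tag, *params = part.strip.split(';')
    next if tag.nil? || tag.empty?

    q = params.map(&:strip).find { |param| param.start_with?('q=') }
    weight = q ? q.delete_prefix('q=').to_f : 1.0
    [tag.downcase.split('-').first, weight, index]
  end
  entries.compact
         .sort_by { |_, weight, index| [-weight, index] }
         .map(&:first)
         .uniq
end

# An explicit, supported user choice always wins over auto-detection.
def choose_language(saved_choice, accept_language)
  return saved_choice if SUPPORTED.include?(saved_choice)

  preferred_languages(accept_language).find { |lang| SUPPORTED.include?(lang) } || DEFAULT
end
```

Because the saved choice is checked first, a visitor who picks English once keeps getting English even on a browser configured for another language.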
I agree that the auto-detect is not enough.
Not many users know where the setting for selecting their language is. The setting will therefore often be left at the default, and so be incorrect (for non-English users).

Resources