How do I connect my database to API.AI? - machine-learning

How do I connect my database to API.AI?
Turning every sentence into an intent and creating entities for each one doesn't seem like a good idea. What is the best way to go about this?

As far as I know it is not possible yet, but you can switch to raw mode and paste your entities in CSV or JSON format, or import a JSON/CSV file containing all your entities.
The file should look like this (JSON format):
[
{
"value": "val1",
"synonyms": [
"syn1",
"syn2"
]
},
{
"value": "val2",
"synonyms": [
"syn21",
"syn22"
]
}
]
So you can imagine writing a small job that reads entities from your DB and generates a JSON/CSV file in the required format.
Once the job is in place, it can make creating your entities on api.ai much easier.
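Such a job could look roughly like the sketch below. It assumes the entities live in a hypothetical SQLite table entity_synonyms(value, synonym); the table, column, and function names are illustrative and not part of API.AI.

```python
import json
import sqlite3


def build_entities(rows):
    """Group (value, synonym) rows into the entry format shown above."""
    grouped = {}
    for value, synonym in rows:
        synonyms = grouped.setdefault(value, [])
        if synonym is not None and synonym not in synonyms:
            synonyms.append(synonym)
    return [{"value": v, "synonyms": s} for v, s in grouped.items()]


def export_entities(conn, path):
    """Read all rows from the (hypothetical) table and write the JSON file."""
    rows = conn.execute(
        "SELECT value, synonym FROM entity_synonyms ORDER BY value, synonym"
    ).fetchall()
    entities = build_entities(rows)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entities, fh, indent=2)
    return entities
```

The resulting file can then be pasted in raw mode or imported as described above.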

If you use a webhook for an intent, you can pass parameters to your endpoint, where you can run all the queries against your DB.
I did a demo where I was querying news (cheating, as I was getting it from the web, but I could have plugged in a DB).
The agent was getting requests such as:
"What are the latest news about France"
Here, latest and France would be parameters sent through to the webhook endpoint.
API.AI would send the following JSON to your endpoint:
"result": {
"source": "agent",
"resolvedQuery": "latest news about France",
"action": "show.news",
"actionIncomplete": false,
"parameters": {
"adjective": "latest",
"subject": "France"
}
}
Then you can query all the news for France and order it by most recent.
In my understanding, the idea is to create entities that are "placeholders" for the values you need to query.
Then you teach the AI with a few examples by tagging, in each request, what the person asked for. Let's say someone asks:
"what is the oldest news about France?"
The AI may not know what oldest is, so you tell it that it is an adjective, and from then on you can get oldest as a parameter.
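As a rough sketch of the endpoint side, the function below takes the parsed request body and uses the two parameters to filter and order results. The NEWS list is a made-up stand-in for a real database table, and handle_webhook is a hypothetical name; building the reply that goes back to API.AI is left out.

```python
from datetime import date

# Hypothetical in-memory stand-in for a news table in your DB.
NEWS = [
    {"title": "New year", "subject": "France", "published": date(2017, 1, 1)},
    {"title": "Election update", "subject": "France", "published": date(2017, 1, 3)},
    {"title": "Weather", "subject": "Spain", "published": date(2017, 1, 2)},
]


def handle_webhook(payload, news=NEWS):
    """Return the news items matching the parameters in the request."""
    result = payload["result"]
    if result.get("action") != "show.news":
        return []
    params = result.get("parameters", {})
    subject = params.get("subject")
    adjective = params.get("adjective", "latest")
    matches = [item for item in news if item["subject"] == subject]
    # "oldest" sorts ascending by date; anything else is treated as "latest".
    newest_first = adjective != "oldest"
    return sorted(matches, key=lambda item: item["published"], reverse=newest_first)
```

With a real database, the filter and the ordering would become the WHERE and ORDER BY parts of the query.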

Related

Album mbId in track's metadata

I am using MusicBrainz to get a track's metadata. I want the MBID of the track's album, so I am doing the following lookup by ISRC code:
https://musicbrainz.org/ws/2/isrc/USRC11600201?fmt=json
But in response I don't get any metadata related to the album of the track. I get the following response:
{
"isrc": "USRC11600201",
"recordings": [
{
"disambiguation": "single remix",
"title": "Cheap Thrills",
"id": "92e27a47-3546-4bc2-a9f7-b19e43d7a531",
"length": 223000,
"video": false
},
{
"length": 218540,
"video": false,
"title": "Cheap Thrills",
"disambiguation": "",
"id": "5845e975-33b4-4b0d-8e74-8f57d128b3d1"
}
]
}
I have tried various combinations of the "inc" query parameter as well, but nothing works. Please help me out. I am really stuck on this.
Using inc=releases in the URL parameters should be enough to get you the information that you want. However, it seems like there's a bug with MusicBrainz's JSON web service (which is still officially in beta), as you can see in the difference between the XML and JSON end points' output:
https://musicbrainz.org/ws/2/isrc/USRC11600201?inc=releases&fmt=json (JSON) vs. https://musicbrainz.org/ws/2/isrc/USRC11600201?inc=releases (XML).
One obvious solution/work-around here would be to switch to using the more mature XML endpoint. If that is not an option, you can use the Recording MBIDs given in the JSON output to look up releases associated with those Recordings, e.g., https://musicbrainz.org/ws/2/recording/5845e975-33b4-4b0d-8e74-8f57d128b3d1?inc=releases&fmt=json (note that inc=releases is also needed here to get the information about the releases, and it actually works when looking up recordings).
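The two-step JSON workaround could be sketched as follows. The helper names are illustrative, the User-Agent value is a placeholder, and the sketch assumes the recording lookup returns a "releases" list whose items carry an "id" field.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://musicbrainz.org/ws/2"


def isrc_url(isrc):
    return f"{BASE}/isrc/{urllib.parse.quote(isrc)}?fmt=json"


def recording_url(mbid):
    # inc=releases is needed here to get the release information.
    return f"{BASE}/recording/{urllib.parse.quote(mbid)}?inc=releases&fmt=json"


def recording_ids(isrc_response):
    return [rec["id"] for rec in isrc_response.get("recordings", [])]


def release_ids(recording_response):
    return [rel["id"] for rel in recording_response.get("releases", [])]


def fetch_json(url):
    # Placeholder User-Agent; replace it with one identifying your application.
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-app/0.1 (contact@example.com)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def album_mbids_for_isrc(isrc):
    """Step 1: ISRC -> recording MBIDs. Step 2: each recording -> release MBIDs."""
    albums = []
    for mbid in recording_ids(fetch_json(isrc_url(isrc))):
        for release in release_ids(fetch_json(recording_url(mbid))):
            if release not in albums:
                albums.append(release)
    return albums
```

This makes one extra request per recording, so for bulk lookups it would need some pause between calls to stay within the service's rate limits.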
So, to get the details of a track's album when I have the track's ISRC, I need to make the following GET request:
https://musicbrainz.org/ws/2/isrc/GBUM71604605?inc=releases
It returns the response in XML. The XML API is more stable.
As I need the response in JSON, I can use an XML-to-JSON conversion library.
As far as I have seen, the XML response from the MusicBrainz API is more accurate and gives a lot more information.

Strategies for sending weekly newsletter emails from Rails

So we have about 50,000 users who have signed up for a weekly newsletter. The content of this email is personalized for each user, though; it's not a mass email.
We are using Rails 4 and Mandrill.
Right now it takes about 12 hours every time we fire off this emails.rake task, and I'm looking for a way to distribute that time or make it shorter.
What techniques can I use to improve this time, which only grows longer as more people sign up?
I was thinking of using mandrill templates, and just sending the json object to mandrill and have them send the email from their end, but I'm not really sure if this is even going to help improve speeds.
At the 50,000+ level: How do I keep email sending times manageable?
Looks like you could use MailyHerald. It is a Rails gem for managing application emails. It sends personalized emails in the background using Sidekiq worker threads which should help you out in terms of performance.
MailyHerald has a nice Web UI and works with email services like Amazon SES or Mandrill.
You probably need to look into Merge Tags on Mandrill. They allow you to define custom content per email, so you can break your newsletter sending into fewer API calls to Mandrill instead of one per email. The number of calls will depend on the size of your data, since there is probably a limit on request size.
You can create a template and put in merge tags such as *|custom_content_placeholder|* wherever user-specific content needs to be placed. You can do this templating in your own system and pass it in with the message, or you can set it up in Mandrill and make a call to that template.
When you make the Mandrill API call to send an email or an email template, you attach the JSON data, such as:
"message": {
"global_merge_vars": [
{
"name": "global_placeholder",
"content": "Content to replace for all emails"
}
],
"merge_vars": [
{
"rcpt": "user@domain.com",
"vars": [
{
"name": "custom_content_placeholder",
"content": "User specific content"
},
{
"name": "custom_content_placeholder2",
"content": "More user specific content"
}
]
},
{
"rcpt": "user2@domain.com",
"vars": [
{
"name": "custom_content_placeholder",
"content": "User2 specific content"
},
{
"name": "custom_content_placeholder2",
"content": "More user2 specific content"
}
]
}
],
You can find more info on Merge Tags here:
https://mandrill.zendesk.com/hc/en-us/articles/205582487-How-to-Use-Merge-Tags-to-Add-Dynamic-Content
If you are familiar with handlebars for templating, Mandrill now supports it with the merge tags:
http://blog.mandrill.com/handlebars-for-templates-and-dynamic-content.html
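To illustrate the batching idea, here is a minimal sketch that splits users into groups and builds one message object per group in the shape shown above. It is written in Python for brevity rather than Ruby, the batch size is an arbitrary assumption, and the "to" recipient list is an assumption not shown in the JSON above. Actually sending the message is left out.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_message(users, global_content):
    """Build one message object covering every user in the batch."""
    return {
        "global_merge_vars": [
            {"name": "global_placeholder", "content": global_content}
        ],
        # Assumed recipient list; check the API docs for the exact shape.
        "to": [{"email": user["email"]} for user in users],
        "merge_vars": [
            {
                "rcpt": user["email"],
                "vars": [
                    {
                        "name": "custom_content_placeholder",
                        "content": user["content"],
                    }
                ],
            }
            for user in users
        ],
    }


def build_batches(users, global_content, batch_size=500):
    # batch_size is a guess; tune it to whatever request limit applies.
    return [build_message(batch, global_content) for batch in chunked(users, batch_size)]
```

With 50,000 users and a batch size of 500, this produces 100 API calls instead of 50,000, and each call can be handed to a background worker.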

Importing web service with Core Data or SQLite or something else?

I'm creating an Events app which needs to pull data from a JSON web service to get information about the artists and the shows that are being played. The data will be used to display the line up of artists (a to z) on one view, artists by date and time on another view, and artists by location and sorted by date/time on a third view. We will also allow the user to add shows to their schedule.
The JSON data is similar to this:
Artists feed:
[
{
"artists": {
"3": {
"id": "3",
"title": "Kendrick Lamar",
"subtitle": null,
"imageURL":
"//goevent-images.s3.amazonaws.com/.../web/artist_3_20140331112744_d57b5a70.jpg",
"gcInfo": "artist$kendrick-lamar/3",
"shows": [ {
"id": 153,
"venueTitle": "Sapporo scene",
"formattedDate": "Sunday, August 31",
"date": "2014-08-31",
"title": "Kendrick Lamar"
}
],
"tags": ",8,159,164,",
"color": "#00a0a0",
"dates": [
"2014-08-31"
]
},... },
]
Shows feed:
[
{
"items": {
"197": {
"id": 197,
"title": "Arcade Fire",
"type": "artist",
"dateStart": "2014-08-30",
"timeStart": "16:00:00",
"formattedTimeStart": " 4:00 PM",
"gcInfo": "artist$arcade-fire/127",
"venueId": "1",
"tags": ",80,",
"color": "#337FC3"
}
}
]
Shows and artists will have a many to many relationship. I'll also need to create an entity/table for storing the user's shows that will be added to their personal schedule.
Bands/shows do have the possibility of being removed from the feed, so I think I'll likely need to clear out the artists and shows entities/tables before importing. I'm worried this will break the relationship to the user's scheduled shows.
I also need to download as much of the high-level data as possible upfront so that the app can be used offline as well.
So my question is:
What's the best approach to importing and storing the data for this?
“Best” is subjective. As I understand it, Core Data uses SQLite under the covers, so it's really more a matter of what you're comfortable with.
Have you used Core Data before? If so, use that.
Have you used an SQL database before? If so, use SQLite.
If you're starting from square one, I suppose I'd recommend Core Data.
Here are a few links to get you started:
Data Management in iOS by Apple
Core Data Programming Guide from Apple.
Core Data Tutorial for iOS: Getting Started
How to Use SQLite to Manage Data in iOS Apps
SQLite Tutorial for iOS: Creating and Scripting
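For the SQLite route, one possible layout is sketched below, using Python's sqlite3 module only to keep the example short and runnable. All table and column names are hypothetical. The user's schedule stores just the feed's show id with no foreign key, so clearing and reloading the feed tables does not delete the user's choices.

```python
import sqlite3

SCHEMA = """
CREATE TABLE artists (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT NOT NULL, date_start TEXT);
-- Join table for the many-to-many relationship.
CREATE TABLE artist_shows (
    artist_id INTEGER NOT NULL,
    show_id INTEGER NOT NULL,
    PRIMARY KEY (artist_id, show_id)
);
-- Survives feed reloads because nothing cascades into it.
CREATE TABLE scheduled_shows (show_id INTEGER PRIMARY KEY);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def replace_feed(conn, artists, shows, links):
    """Clear and reload the feed tables in a single transaction."""
    with conn:
        conn.execute("DELETE FROM artist_shows")
        conn.execute("DELETE FROM shows")
        conn.execute("DELETE FROM artists")
        conn.executemany("INSERT INTO artists (id, title) VALUES (?, ?)", artists)
        conn.executemany(
            "INSERT INTO shows (id, title, date_start) VALUES (?, ?, ?)", shows
        )
        conn.executemany(
            "INSERT INTO artist_shows (artist_id, show_id) VALUES (?, ?)", links
        )


def scheduled_titles(conn):
    """Scheduled shows that still exist in the current feed, by start date."""
    rows = conn.execute(
        "SELECT s.title FROM scheduled_shows sc "
        "JOIN shows s ON s.id = sc.show_id ORDER BY s.date_start"
    ).fetchall()
    return [row[0] for row in rows]
```

A scheduled show that disappears from the feed simply stops appearing in the join, and the app can decide whether to hide it or flag it as cancelled. The same idea carries over to Core Data by keeping the feed's id as an attribute on the schedule entity.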

In Mandrill's API, is there any way to know the size limits of JSON attributes in their Responses?

I'm looking at the Mandrill API documentation, e.g. https://mandrillapp.com/api/docs/messages.JSON.html#method=info. In the JSON response there may be something like the following:
{
"ts": 1365190001,
"url": "http://www.example.com",
"ip": "55.55.55.55",
"location": "Georgia, US",
"ua": "Linux/Ubuntu/Chrome/Chrome 28.0.1500.53"
}
Is there a way to know how large the strings may be for these attributes, i.e. how many characters might be returned for "ua" or "url"? I'm asking because I need to capture and store some of this data in Oracle, but I don't want to LOB everything!

Restkit: How to get and map data from multiple source

I'm currently working on an iOS application that uses RestKit 0.20 to access data from a Tastypie API.
I am trying to get feed data from a URL like this:
/api/v2/feed/?format=json
I then get an array of feeds, as below:
{
"meta": {
"limit": 20,
"next": null,
"offset": 0,
"previous": null,
"total_count": 2
},
"objects": [
{
"id": 1,
"info": "This is my first post",
"pub_date": "2013-02-03T15:59:33.311000",
"user": "/api/v2/user/1/",
"resource_uri": "/api/v2/feed/1/"
},
{
"id": 2,
"info": "second post, yeah",
"pub_date": "2013-02-03T16:00:09.350000",
"user": "/api/v2/user/1/",
"resource_uri": "/api/v2/feed/2/"
}
]
}
If I want to fetch more data about the user, which Tastypie sends as a URL like a foreign key ("user": "/api/v2/user/1/"), do I have to nest calls to objectRequestOperation?
I'm confused because I'm using a block as the callback when the data has loaded successfully. Is there any better way than requesting the user data again for each feed after the feed request completes?
Thank you very much :)
You have to define this in the Feed resource:
user = fields.ToOneField(UserResource, full=True)
More info in the tastypie doc http://django-tastypie.readthedocs.org/en/latest/resources.html
Why Resource URIs?
Resource URIs play a heavy role in how Tastypie delivers data. This can seem very different from other solutions which simply inline related data. Though Tastypie can inline data like that (using full=True on the field with the relation), the default is to provide URIs.
URIs are useful because it results in smaller payloads, letting you fetch only the data that is important to you. You can imagine an instance where an object has thousands of related items that you may not be interested in.
URIs are also very cache-able, because the data at each endpoint is less likely to frequently change.
And URIs encourage proper use of each endpoint to display the data that endpoint covers.
Ideology aside, you should use whatever suits you. If you prefer fewer requests & fewer endpoints, use of full=True is available, but be aware of the consequences of each approach.
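If you stay with the default URIs instead of full=True, the client does not need one user request per feed: it can collect the distinct user URIs first and fetch each one once. The sketch below shows that logic in Python rather than Objective-C, with hypothetical function names, and leaves the actual HTTP calls out.

```python
def unique_user_uris(feed_response):
    """Return each distinct user URI from the feed list, in first-seen order."""
    seen = []
    for obj in feed_response["objects"]:
        uri = obj["user"]
        if uri not in seen:
            seen.append(uri)
    return seen


def attach_users(feed_response, users_by_uri):
    """Replace each feed's user URI with the user data fetched for that URI."""
    return [
        dict(obj, user=users_by_uri[obj["user"]])
        for obj in feed_response["objects"]
    ]
```

For the response shown in the question, both feeds point at /api/v2/user/1/, so a single extra request would cover them.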

Resources