Commercial API Q: Is there an API for converting vision tags into a caption?


There are many machine learning APIs for scanning images, but they just return a bunch of tags.

{ "tags": [ "train", "platform", "station", "building", "indoor", "subway", "track", "walking", "waiting", "pulling", "board", "people", "man", "luggage", "standing", "holding", "large", "woman", "yellow", "suitcase" ],  "confidence": 0.833099365 } ] }

Are there any APIs for combining these into a sentence? MS Cognitive Vision is the only one that produces a full caption:

"captions": [ { "text": "people waiting at a train station",
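For reference, pulling the caption out of that kind of response is the easy part. Here is a minimal sketch, assuming the tags and captions sit together under a `description` key (my guess at the full response shape; the fragments above don't show the outer structure). The helper name `best_caption` and the shortened tag list are just for illustration.

```python
import json
from typing import Optional

# Sample payload assembled from the fragments quoted above. The outer
# "description" wrapper is an assumption, and the tag list is shortened.
SAMPLE_RESPONSE = """
{
  "description": {
    "tags": ["train", "platform", "station", "people", "waiting"],
    "captions": [
      {"text": "people waiting at a train station", "confidence": 0.833099365}
    ]
  }
}
"""


def best_caption(response: dict) -> Optional[str]:
    """Return the highest-confidence caption text, or None if there is none."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    # Captions without a confidence field are treated as confidence 0.0.
    return max(captions, key=lambda c: c.get("confidence", 0.0))["text"]


caption = best_caption(json.loads(SAMPLE_RESPONSE))
print(caption)
```

What I can't find is a service that does the same for the tag-only responses from the other APIs.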

Google's sentiment analysis can split a sentence into its grammatical parts, but is there any API that does the reverse?

INPUT: "train", "platform", "station", "building", "indoor", "subway", "track", "walking", "waiting", "pulling", "board", "people", "man", "luggage", "standing", "holding", "large", "woman", "yellow", "suitcase"

OUTPUT: "people waiting at a train station"


Thanks for the down votes without explanation. Another reason I don't support it any more. – brian.clear – 2018-03-22T15:07:20.303

That's what people do on this specific Stack Exchange. I got the same done to my question. :/ Did you look at chat bots? Maybe you can adapt one of them to glue words into sentences? – Alexus – 2018-03-22T18:37:05.227

Sorry about the downvote welcome. It happens. Welcome to AI, regardless! – DukeZhou – 2018-03-22T19:36:44.760

Alexus and brian.clear, I gave you guys some votes. I dislike the number of people down voting without explaining. – Brian O'Donnell – 2018-03-24T03:15:05.703

I'm afraid this is the reason I use Stack Overflow in read-only mode. This was mentioned on another screen: in regards to how MS does it, for their Cognitive Services APIs they build their models with CNTK. – brian.clear – 2018-03-24T18:36:33.990

