Delphi and iPhone helping vision impaired people
Friday, September 30, 2016
Earlier this week, we released a major update of the TMS FMX Cloud Pack. This new version adds many new components that give seamless access to a range of interesting cloud services. Among the newly covered services, two from Microsoft stand out and open up new ways to enrich our Delphi applications with cool features. In this blog, I want to present the Microsoft Computer Vision and Microsoft Bing Speech services. Our new components TTMSFMXCloudMSComputerVision and TTMSFMXCloudMSBingSpeech offer instant and dead-easy access to these services. With these components at hand, the idea came up to create a small iPhone app that lets vision impaired people take a picture of their environment or a document, have the Microsoft Computer Vision service analyze the picture, and let Microsoft Bing Speech read out the result.
So, roll up your sleeves and in 15 minutes you can assemble this cool iPhone app powered by Delphi 10.1 Berlin and the TMS FMX Cloud Pack!
To get started, we add the code that takes a picture with the iPhone camera. This snippet comes right from the Delphi docs. The camera is started from a button's OnClick event:
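// Handler name and variable declarations are assumed, following the
// Delphi docs sample (units FMX.MediaLibrary and FMX.Platform)
procedure TForm1.Button1Click(Sender: TObject);
var
  Service: IFMXCameraService;
  Params: TParamsPhotoQuery;
begin
  if TPlatformServices.Current.SupportsPlatformService(IFMXCameraService, Service) then
  begin
    Params.Editable := True;
    // Specifies whether to save a picture to device Photo Library
    Params.NeedSaveToAlbum := False;
    Params.RequiredResolution := TSize.Create(640, 640);
    Params.OnDidFinishTaking := DoDidFinish;
    Service.TakePhoto(Button1, Params);
  end;
end;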
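Next, the application keys for the two Microsoft services are assigned to the components when the form is shown:
procedure TForm1.FormShow(Sender: TObject);
begin
  TMSFMXCloudMSBingSpeech1.App.Key := MSBingSpeechAppkey;
  TMSFMXCLoudMSComputerVision1.App.Key := MSComputerVisionAppkey;
end;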

When the picture is taken, it is passed to the DoDidFinish handler, where the analysis is started with the call TMSFMXCLoudMSComputerVision1.ProcessFile(s, cv). This call runs in a TTask so that the UI is not locked during the analysis. After all, the image must be submitted to Microsoft and processed, and the result must be returned and parsed, which can take 1 or 2 seconds. Depending on the analysis type, the result is captured as text in a memo control. After this, we connect to the Bing speech service.
procedure TForm1.DoDidFinish(Image: TBitmap);
var
  aTask: ITask;
  s: string;
  cv: TMSComputerVisionType;
begin
  CaptureImage.Bitmap.Assign(Image);
  // take local copy of the file for processing
  s := TPath.GetDocumentsPath + PathDelim + 'photo.jpg';
  Image.SaveToFile(s);
  // asynchronously start image analysis
  // (Description is not declared in this snippet; presumably a string field of TForm1)
  aTask := TTask.Create(
    procedure
    var
      i: Integer;
    begin
      if btnAn0.IsChecked then
        cv := ctAnalysis;
      if btnAn1.IsChecked then
        cv := ctOCR;
      if TMSFMXCLoudMSComputerVision1.ProcessFile(s, cv) then
      begin
        Description := '';
        if cv = ctAnalysis then
        begin
          // concatenate the image description returned from Microsoft Computer Vision API
          for i := 0 to TMSFMXCLoudMSComputerVision1.Analysis.Descriptions.Count - 1 do
          begin
            Description := Description + TMSFMXCLoudMSComputerVision1.Analysis.Descriptions[i] + #13#10;
          end;
        end
        else
        begin
          Description := TMSFMXCLoudMSComputerVision1.OCR.Text.Text;
        end;
        // update UI in main UI thread
        TThread.Queue(TThread.CurrentThread,
          procedure
          begin
            if Assigned(AnalysisResult) then
              AnalysisResult.Lines.Text := Description;
          end);
        TMSFMXCloudMSBingSpeech1.Connect;
      end
      else
      begin
        // update UI in main UI thread
        TThread.Queue(TThread.CurrentThread,
          procedure
          begin
            AnalysisResult.Lines.Add('Sorry, could not process image.');
          end);
      end;
    end);
  aTask.Start;
end;
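Once connected, the text from the memo is sent to the Bing speech service for synthesis and the resulting audio is played:
procedure TForm1.TMSFMXCloudMSBingSpeech1Connected(Sender: TObject);
var
  st: TMemoryStream;
  s: string;
begin
  st := TMemoryStream.Create;
  s := AnalysisResult.Lines.Text;
  try
    // synthesize the text into the stream, then play the audio
    TMSFMXCloudMSBingSpeech1.Synthesize(s, st);
    TMSFMXCloudMSBingSpeech1.PlaySound(st);
  finally
    st.Free;
  end;
end;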
Now, let's try out the app in the real world. Here are a few examples we tested.


Using the app on the road to read road signs and capture car license plates

Trying to figure out what we see in a showroom


First, the app correctly recognized a bottle of wine in the cellar, and it was then pretty good at reading the wine bottle label.
You can download the full source code of the app here and have fun discovering these new capabilities.
Enjoy!
Bruno Fierens

This blog post has received 6 comments.


2) We tested this sample on iPhone, but I expect it to work on Android as well. In fact, I don't see much reason why it wouldn't work on Android, but I haven't tested it on Android here so far.
Bruno Fierens

1) yes sure!
2) the same code does not work
Antonello
Carlomagno Antonello


Meanwhile, we got confirmation from other users that the code also works on Android, so I guess there must be something specific to your IDE / Android setup.
Bruno Fierens

rekan.jalal19

Stoyan
I have 2 questions:
1 - The Bing speech service returns an "access denied" error
2 - Why does it only work on iPhone?
thanks
antonello
Carlomagno Antonello