The motivation for this blog post is to have some fun and demonstrate the agility with which you can develop mobile applications when you combine the advantages of serverless technology and the Dropsource platform. You should be able to reproduce what I’ve done in an hour by following this post.
If you’re a fan of the HBO show Silicon Valley, you may remember the episode in which Jian Yang builds a machine-learning image recognition app that turns out to recognize only photos of hot dogs. I wanted to do the same, but take it a bit further and recognize almost anything!
I’ve done this by taking advantage of Amazon Rekognition (in coordination with some other services), as well as a simple REST API on the Serverless framework that acts as the web service for the iOS app I built on Dropsource.
I won’t keep you waiting: you can scan the following QR code to install (sideload) the app on your iOS device. Open the camera on your iPhone and point it at the code, then tap the “Open in iTunes” drop-down that appears. If this is your first time testing a Dropsource-built app, you’ll need to trust our developer certificate by following these simple instructions.
Building your own app for free in Dropsource and AWS
If you’d like to learn how to build your own app, follow the instructions below. It should only take you an hour to have a fully functioning app on your phone without having to write a single line of Swift! You can take advantage of your Dropsource free trial and the AWS free introductory tier to host your image recognition backend and API, which I’ve provided for you below.
Deploy the Rekognition Stack on AWS
The first thing we’re going to get running are the services that process the images we upload from the app and give us the tags describing them. In short, this set of services will detect when something is uploaded to an S3 bucket, pass that through the Rekognition analysis, and then save the results in a DynamoDB table. This is done with AWS Step Functions that coordinate several Lambda functions to invoke various services.
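To make that flow a bit more concrete, here is a rough Python sketch of the analyze-and-store part. It is not the code from the AWS Labs project, which splits the work across several Lambda functions coordinated by Step Functions. The function name `process_upload` and the item attributes `imageID` and `tags` are my own assumptions for illustration. The `rekognition` and `dynamodb` arguments are expected to behave like boto3 clients (`detect_labels` and `put_item`).

```python
def process_upload(bucket, key, rekognition, dynamodb, table_name, min_confidence=60.0):
    """Label one uploaded S3 object and store the labels (illustrative only)."""
    # Ask Rekognition to label the image where it already sits in S3.
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    tags = [label["Name"] for label in response["Labels"]]

    # Save the result. The attribute names below are hypothetical and
    # will not match the real table schema of the deployed stack.
    dynamodb.put_item(
        TableName=table_name,
        Item={
            "imageID": {"S": key},
            "tags": {"L": [{"S": tag} for tag in tags]},
        },
    )
    return tags
```

Passing the clients in as arguments keeps the sketch easy to try with stand-in objects before pointing it at a real AWS account.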
For a visual representation, see the diagram below.
Getting all this up and running turns out to be much simpler than you’d imagine, thanks to the lambda-refarch-imagerecognition project AWS Labs has made available on GitHub.
Simply scroll down to the readme and hit the ‘Launch Stack’ button. You’ll need an AWS account to get started, but that’s easy and free to use. Follow the three steps to get it running in the us-west-2 region, and we’re done with that set of documentation. Once that executes, navigate to the ‘Outputs’ section of the stack we just launched in the CloudFormation console. We need to make note of a few values that are provided here, including ‘S3PhotoRepoBucket’ and ‘DDBImageMetadataTable’, which we’ll copy into our API configuration later.
Leave the tab open for reference when setting up the next piece of our application backend, the REST API.
Create and Deploy The REST API
Dropsource works really well with REST APIs documented with Swagger (OpenAPI Spec), so we’re going to quickly create one for our app to interact with the image recognition stack we just deployed. It will serve two major functions:
- Uploading images from our mobile device to the S3 bucket and providing us with the object key for referencing later.
- Querying the DynamoDB table to fetch the results posted by the Rekognition service, which are an array of image tags we’ll use to determine the content of the image.
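As a rough idea of what those two functions involve, here is a minimal Python sketch. It is not the REST API I’ve provided, just an illustration under assumptions: the object key layout, the `imageID` and `tags` attribute names, and the function names are all hypothetical. The `s3` and `dynamodb` arguments are expected to behave like boto3 clients (`put_object` and `get_item`).

```python
import base64
import uuid


def save_image(s3, bucket, image_data_b64, user_name, album_name):
    """Decode a Base64 image, store it in S3, and return its object key."""
    # Hypothetical key layout; the real service may name objects differently.
    key = f"{user_name}/{album_name}/{uuid.uuid4()}.jpg"
    s3.put_object(Bucket=bucket, Key=key, Body=base64.b64decode(image_data_b64))
    return {"imageID": key}


def get_image_tags(dynamodb, table_name, image_id):
    """Return the stored tags for an image, or an empty list if none exist yet."""
    result = dynamodb.get_item(
        TableName=table_name,
        Key={"imageID": {"S": image_id}},  # assumed primary key name
    )
    item = result.get("Item")
    if item is None:
        return []
    return [entry["S"] for entry in item.get("tags", {}).get("L", [])]
```

Returning an empty list when nothing has been written yet lets the caller simply ask again a little later.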
Serverless Framework is my favorite tool for this task. It’s super quick to get an API running on a serverless AWS Lambda stack (or a variety of other cloud providers), you can code in one of many languages, and you can take advantage of API Gateway to export a Swagger file you’ll need for importing your API directly into Dropsource.
This post won’t go too deep into how to write services with Serverless; there’s plenty of great documentation already out there for that. For now, you can just follow the documentation on the Serverless site to quickly install and configure the framework, and then launch the REST API I’ve already written for you, which you can find here.
Deploying the REST API using Serverless is painless, but there are a couple of items we need to configure before doing that. Open the serverless.yml file in the project I’ve provided as well as the CloudFormation tab with the values I mentioned above.
Near the top of the serverless.yml file, you’ll see the ‘custom’ section with two values. Copy the value for ‘S3PhotoRepoBucket’ from the CloudFormation console to the ‘bucket’ property, and do the same with the ‘DDBImageMetadataTable’ value to the ‘dynamoDbTable’ property in your serverless.yml.
The last value you need to configure in your serverless.yml file comes from the DynamoDB console in AWS. Open the DynamoDB console, click ‘Tables’ on the left, then click on photo-rekognition-demo-app-backend-ImageMetadataDDBTable to see the ‘Overview’ tab for the table. Scroll to the bottom of that tab and copy the value of the Amazon Resource Name (ARN) property. It should look something like: arn:aws:dynamodb:us-west-2:000000000000:table/photo-rekognition-demo-app-backend-ImageMetadataDDBTable-ABC1ABCABCAB
Now go back to your serverless.yml file and paste that value under provider->iamRoleStatements->Resource (the second statement; leave the first statement, for the S3 resource, as is). You’ll notice it’s really just the AWS region and your account number prepended to the table name we just configured.
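If you want to double-check the value you pasted, the ARN structure is simple enough to rebuild by hand. The helper names in this small Python sketch are mine, not part of any AWS tooling, and the account number is the placeholder from the example above.

```python
def dynamodb_table_arn(region, account_id, table_name):
    """Build a DynamoDB table ARN from its three variable parts."""
    return f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"


def arn_region(arn):
    """Extract the region, which is the fourth colon-separated field of an ARN."""
    return arn.split(":")[3]


example = dynamodb_table_arn(
    "us-west-2",
    "000000000000",  # placeholder account number
    "photo-rekognition-demo-app-backend-ImageMetadataDDBTable-ABC1ABCABCAB",
)
print(example)
print(arn_region(example))  # should match the region in your serverless.yml
```

Comparing `arn_region(...)` with the ‘region’ property in your yml is a quick way to catch a stack that was launched in a different region.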
While you have the DynamoDB console open, navigate to the ‘Capacity’ tab of each of the two tables shown. You can safely set all four of the capacity values to 1 while you’re building and testing your app; anything more is unnecessary if you’re the only one using the services.
Just to be safe, scroll down in your serverless.yml file and find the ‘provider’ section. Make sure that the ‘region’ property matches the ‘Region’ in your CloudFormation console. If it doesn’t, set the value in your yml to match the one in the console.
You can now switch back to the terminal where you installed Serverless. First, install the dependencies our project needs with the ‘npm install’ command, then run the ‘serverless deploy’ command in the project folder to deploy your REST API to AWS (make sure you’ve run the commands to install Serverless before you do this). I’ve taken the time to document our API endpoints (you may have noticed that in the serverless.yml file), so once it’s deployed we can get a Swagger .json file from the API Gateway console in AWS. To do that, navigate to the service in AWS, click on ‘dev-aws-rekognition-mobile-app-service’ on the left, then ‘Stages’, then ‘dev’ to the right. Open the ‘Export’ tab, and finally hover over the ‘Export as Swagger’ icon and click ‘JSON’, which will start downloading the file.
Building the App in Dropsource
Elements and Styling
If you don’t have a Dropsource account yet, you can easily sign up for a free account by clicking here. After creating your account, create a new project (we’ll be using iOS, but you can do the same thing in Android with Dropsource).
Open the editor by clicking on the project. If you’ve just signed up and created your first project you’ll already be there.
Create a page by clicking on ‘Pages’ to the left and pressing the plus icon. Don’t check either checkbox, click next, and name the page whatever you’d like.
Next, click on ‘Elements’ to the left and drag an ‘Image’ element onto the page; you can search for it in the search box if you’re having trouble finding it. Once you’ve done that, you can style it in the right bar to match the settings depicted below.
Next, click ‘constraints’ (these define the size and positioning of the element on the page) at the top of the right bar and set the constraints for Image 1 to match these (we want the image to be full screen regardless of device screen size):
Next, drag out a button element and place it centered near the bottom of the screen; this will be our camera button. I chose to make this an image-button for purely aesthetic reasons. To do the same, style the button by choosing an image to put on the button.
You’ll have to upload a new one and select it. Here’s the image I used:
After you’ve done that, scroll down to ‘Background Color’ for the button and set it to rgba(255, 255, 255, 0.00). This will remove the white background from the image so we’re just left with a yellow circle.
Next, we’ll create two labels that will show/hide depending on what tags were detected in the image. I’ve gone with two because my intention was to build the satirical ‘Not Hotdog’ app from the ‘Silicon Valley’ HBO show. You could easily exercise some design freedom here and choose to show only the image tags returned.
We’ll set up the first label and copy it to create the second label, so drag out a label element and drop it on the screen. Set the text to ‘HOTDOG!’, then style it just like the images below. (Note that clicking ‘Hidden’ will hide the label from the page. Do this last so that you can see what you’re working on. We’ll eventually set up logic to show it based on the image tags returned.)
Also set the constraints to match this:
Now, right-click the label on the screen and copy it, then right-click the page canvas and paste to create an identical label.
Important note: it may be difficult to select the right element when they’re on top of each other. To make this easier you can click on the Element Tree icon to see and select them. You can also use the element tree to reorder how each element is positioned front-to-back. For the finished product we’ll want the image at the top, followed by the button, and then labels.
Style this new label with new text ‘NOT HOTDOG!’, set the Color to ‘yellow’ and the Background Color to ‘red’. Finally, position it identically to the first one (it should be directly on top of the first), and set both labels to Hidden by checking the check box in each.
The last element we need is another label that will show the text for the image tags returned. Drag out a third label from the elements drawer on the left. Style it and set constraints to match this:
Logic and API Requests
Importing the API
Start by importing the Swagger .json file we exported after deploying our REST API. Click the API tab in the right bar, and then the + button.
Use the dropdown to add one each of the PUT /saveImage and GET /getImageTags endpoints. We’ll configure them later.
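If either endpoint is missing from the dropdown, it’s worth checking that the exported file actually contains it. Here is a small Python sketch for that, assuming the export follows the Swagger 2.0 layout with a top-level `paths` object. The function names and the file name in the usage comment are hypothetical.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head"}


def load_swagger(path):
    """Read an exported Swagger .json file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def list_operations(swagger):
    """Return sorted 'METHOD /path' strings found under the 'paths' object."""
    operations = []
    for path, item in swagger.get("paths", {}).items():
        for method in item:
            # Skip non-method keys such as 'parameters'.
            if method.lower() in HTTP_METHODS:
                operations.append(f"{method.upper()} {path}")
    return sorted(operations)


def missing_operations(swagger, expected=("PUT /saveImage", "GET /getImageTags")):
    """Return the expected operations that the document does not define."""
    found = set(list_operations(swagger))
    return [operation for operation in expected if operation not in found]


# Usage (the file name is whatever your browser saved the export as):
#   print(missing_operations(load_swagger("swagger-export.json")))
```

An empty list means both endpoints are present and the import should offer them.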
Setting up Page Variables
Next, we’ll set up three page variables to hold some data and constructs we’ll use in our app.
Do this by selecting the Page Variables tab as shown in the image below.
The three page variables should be named as shown above and have the following types:
- imageData: Data Types – Swift – String
- imageID: Data Types – Swift – String
- imageTagPollingTImer: Data Types – Foundation – Timer
Setting up actions and events
Next, double-click the button element on your page and select the events tab for it.
Click the manage button next to the ‘Tapped’ event. Add an action by clicking the plus button on the left. Search for ‘Select Image from Photos’ and choose between Camera and Photo Library. If you’ve got an iOS device you can use to test your app, the camera will work for you, but go with photo library if you’re going to test your app with the in-browser simulator. You can always come back here and edit this later.
Underneath the ‘Image Chosen’ sub-action we now have, we need to add seven more actions and configure them as shown in the three images below.
Set Value – Hide Label 1 to reset the UI
Set Value – Hide Label 2 to reset the UI
Set Value – Hide Label 3 to reset the UI
Set Value – Unhide the image element Image 1
Set Value – Set the image we chose from the camera or photo library to display in Image 1
Show Progress Overlay – Show a loading state while we save the image and wait for tags describing it
Base 64 Encode Image – Encode the image as a Base64 string for sending to our API
We now need two actions under the Image Encoded sub-action of Base64 Encode Image.
Set Value – Set the imageData page variable equal to the Base64 encoded string
Run API Request – Run the API request that saves the image data
Next, we’ll configure the PUT /saveImage call.
In the API tab (far right of the right-bar) click the PUT endpoint we created earlier in the list pictured below.
Select the Parameters tab, and choose “Body” if it’s not already selected in the dropdown. Set the three parameters using the following properties:
- albumName: Static Inputs – String – (enter any album name here)
- imageData: Page Variables – imageData
- userName: Static Inputs – String – (enter any username you’d like here)
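For reference, this is roughly the body the app ends up sending. It is a Python sketch that assumes the three parameters travel as a JSON object and that imageData is the Base64 text produced by the encode step. The function name and the sample values are made up.

```python
import base64
import json


def build_save_image_body(image_bytes, album_name, user_name):
    """Build a JSON request body with the three saveImage parameters."""
    return json.dumps({
        "albumName": album_name,
        # Base64 turns the raw image bytes into plain text that is safe in JSON.
        "imageData": base64.b64encode(image_bytes).decode("ascii"),
        "userName": user_name,
    })


body = build_save_image_body(b"fake image bytes", "demo-album", "demo-user")
print(body)
```

This is also a handy way to build a test payload if you want to try the PUT endpoint from outside the app.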
Next, select the Events tab for the request, and click on the Manage button for the 200 response.
Add a ‘Set Value’ action here that saves the imageID returned when we save the image to the page variable we made. It should look like this.
We’re also going to create a timer that is going to be used to poll the API for image tags and return them when they’re available.
Add a Create Timer action and configure it as shown.
Under Tick, create a Run API Request action and specify the GET /getImageTags endpoint.
Now, outside the Create Timer action and below it, create a new Set Value action that saves the timer to a page variable for referencing later like this.
That’s it for the PUT request. Now close the action modal and click the back arrow next to the request name in the right bar. Select the GET request, click the Events tab, and click Manage next to the 200 response.
Create a Stop Timer action to stop the timer we just set up so it doesn’t continue running after we receive the response we wanted.
Create a Set Value action to set the text value of Label 3 to the image tags.
Create a Hide Progress Overlay action to stop the loading state we had running since we’re done with the request we need to make.
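The timer and the GET request together amount to a simple polling loop. Here is a hedged Python sketch of the same idea. The interval, the attempt limit, and the `fetch_tags` callable are my assumptions. The Dropsource timer itself has no give-up limit in this setup, so `max_attempts` is something I added for the sketch.

```python
import time


def poll_for_tags(fetch_tags, interval_seconds=2.0, max_attempts=15, sleep=time.sleep):
    """Call fetch_tags() repeatedly until it returns tags or attempts run out."""
    for attempt in range(max_attempts):
        tags = fetch_tags()  # one timer 'Tick': run the GET request
        if tags:
            return tags  # equivalent of Stop Timer once tags arrive
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait for the next tick
    return []  # gave up; the analysis never finished in time


# Usage with a stand-in for the real request:
#   tags = poll_for_tags(lambda: call_get_image_tags(image_id))
# where call_get_image_tags is whatever function performs the GET request.
```

Polling is needed here because the image analysis runs asynchronously after the upload, so the tags are not available in the response to the PUT request.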
We’ll now decide which label to show, hotdog or not hotdog, by creating an If… Else conditional action.
Under the True event (if the API says we’ve got a hotdog), create a Set Value action to unhide Label 1 like this.
Under the False event (no hotdog detected), create two Set Value actions to unhide Label 2 and Label 3.
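The same decision is easy to express in code. In this Python sketch the tag text ‘Hot Dog’ is an assumption about what the service returns, so check the tags you actually get back. The function name is also mine.

```python
def labels_to_show(tags, target="Hot Dog"):
    """Return which labels should be visible for a list of image tags."""
    is_match = any(tag.strip().lower() == target.lower() for tag in tags)
    if is_match:
        # True branch: show only the 'HOTDOG!' label.
        return {"Label 1": True, "Label 2": False, "Label 3": False}
    # False branch: show 'NOT HOTDOG!' plus the label listing the tags.
    return {"Label 1": False, "Label 2": True, "Label 3": True}


print(labels_to_show(["Food", "Hot Dog"]))
print(labels_to_show(["Cat", "Pet"]))
```

Swapping the `target` value is all it takes to turn this into a detector for something other than hot dogs.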
That’s it for the app design and logic!
All that’s left are a few settings for the app name and icon. You’ll find these in the left sidebar under Settings.
You can use this image for your app icon:
You can now test your app on your device or in the in-browser simulator by clicking the blue Test button in the top right corner of the editor. Remember that to test in the simulator, you’ll need to have configured your camera button Tapped event to choose images from the photo library, as the simulator doesn’t have camera functionality.
I encourage you to be creative and build your own image recognition apps. The API is ready for you to edit and redeploy, and Dropsource is great for quickly iterating on your app.
If you’ve got any questions or something isn’t working for you, please reach out to me personally at email@example.com, on Twitter at @dperjar, or use the live chat support available with your Premium Project (and Premium trial) in the Dropsource editor.
NOTE: In this tutorial we cloned and expanded some of the basic functionality of a mock Not Hotdog app. A real Not Hotdog app (which is awesome) was built by SeeFood Technologies Inc. and is publicly available here.