Tools we love Vol.7: Diffgram
About the series
As you know, we at Humans in the Loop have a great love and appreciation of a well-designed annotation tool. After the great feedback on the reviews we published of the best platforms on the market here and here, we decided that it’s time for a deep dive into some of our all-time favorites!
This article is the seventh in a series of 10 reviews, all of which you can access on our blog.
The whole series is based on the premise of transparency and honesty and none of these reviews are sponsored. They are just our way to give props to the best teams out there working on making annotation easier for AI teams, and to share some of the know-how that we have been accumulating over the past few years as a professional annotation company.
As in previous reviews, our parameters are:
- price
- functions
- project management
- automation
If you have additional questions or want to get in touch with us to beta test or feature your tool in an upcoming article, feel free to email us at hello@humansintheloop.org!
Diffgram
Created in 2018 as an annotation platform by California-based tech entrepreneur Anthony Sarkis, Diffgram has evolved into a dataset management platform for AI that makes it easy to control, store, and retrieve data. Over the past couple of years, Diffgram has made great progress, always with its customers in mind, and it may have found its unique niche.
Currently the platform offers a free Explorer plan with limited labels, as well as Teams and Enterprise plans which come with additional automation features and priority technical support. Pricing is monthly and is available on request.
Features
Tracking instances across video frames
Diffgram offers a standard interface which includes boxes, polygons, tags, points, and lines. Attributes can also be added as dropdowns, multiple-select fields, free text, or radio buttons. For video annotation, there is a great sequence-tracking feature: each object gets its own ID and is followed across frames, with its position interpolated between keyframes. For polygons, we are big fans of ‘turbo’ mode, which auto-places points as one moves the mouse.
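To make the idea of keyframe interpolation concrete, here is a minimal Python sketch. It assumes simple linear interpolation of axis-aligned boxes; we have not seen how Diffgram implements this internally, and the names `Box`, `interpolate_box`, and `interpolate_track` are our own, chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical structure, not Diffgram's)."""
    x: float
    y: float
    width: float
    height: float


def interpolate_box(start: Box, end: Box, t: float) -> Box:
    """Linearly blend two boxes; t=0 returns start, t=1 returns end."""
    def lerp(a: float, b: float) -> float:
        return a + (b - a) * t

    return Box(
        lerp(start.x, end.x),
        lerp(start.y, end.y),
        lerp(start.width, end.width),
        lerp(start.height, end.height),
    )


def interpolate_track(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Fill in every frame between the first and last keyframe of one object."""
    frames = sorted(keyframes)
    track: Dict[int, Box] = {}
    # Walk over each pair of neighbouring keyframes and fill the gap between them.
    for left, right in zip(frames, frames[1:]):
        span = right - left
        for frame in range(left, right):
            t = (frame - left) / span
            track[frame] = interpolate_box(keyframes[left], keyframes[right], t)
    if frames:
        # The last keyframe is not covered by the loop above, so copy it as-is.
        track[frames[-1]] = keyframes[frames[-1]]
    return track
```

In this sketch the annotator only draws a box on the keyframes, for example frames 0 and 4, and `interpolate_track` produces boxes for frames 1 to 3. One such track would exist per object ID.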
The supported input formats are images and video (up to 4K), while exports are available both as JSON and as TF records of the images and annotations. Both operations can be done from the UI as well as the SDK/API.
The annotation interface is not the most user-friendly one, but the canvas can be resized and moved around depending on the annotator’s preferences. The good news is that in addition to its own interface, Diffgram offers integrations with many of the leading annotation tools, since its current focus is on dataset management rather than annotation.
Project management
Managing annotation projects through Diffgram is quite straightforward. User roles include Admins, Editors, Viewers, and Annotators, although there is no way to set up ‘Supervisor’ or ‘QC’ roles. In addition, Diffgram can be integrated with external annotation workforce providers as ‘Controllers’.
Data is available to annotators from a shared pool, so faster annotators get access to more work. They can either ‘complete’ a task or ‘defer’ it, which is quite useful in difficult annotation projects. Annotator management includes adding ‘Guides’ (written in Markdown) for annotators working on a specific task, as well as setting up ‘Awards’, which can be either ‘required’ in order to get access to a task (e.g. ‘Bounding boxes level 1’) or ‘granted’ upon completing the task.
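The ‘required’ and ‘granted’ Awards logic can be pictured with a small Python sketch. This is only our mental model of the feature, not Diffgram’s SDK or data model: the classes `Task` and `Annotator` and the functions `can_access` and `complete` are hypothetical names we made up for the example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Task:
    """A task with awards needed to open it and awards earned by finishing it."""
    name: str
    required_awards: Set[str] = field(default_factory=set)
    granted_awards: Set[str] = field(default_factory=set)


@dataclass
class Annotator:
    name: str
    awards: Set[str] = field(default_factory=set)


def can_access(annotator: Annotator, task: Task) -> bool:
    """An annotator may open a task only if they hold every required award."""
    return task.required_awards <= annotator.awards


def complete(annotator: Annotator, task: Task) -> None:
    """Mark a task as completed and hand out the awards it grants."""
    if not can_access(annotator, task):
        raise PermissionError(f"{annotator.name} lacks awards for {task.name}")
    annotator.awards |= task.granted_awards
```

Under these assumptions, an introductory task that grants ‘Bounding boxes level 1’ acts as a qualification step: once an annotator completes it, `can_access` starts returning `True` for any task that lists that award as required.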
Statistics are really customizable and reports can be generated on the level of instances, files, events, and tasks, which can be grouped by date, user, task, and label.
Automation
Diffgram’s mission is to make data management easier for data science teams, considering the amount of time spent on compiling datasets, getting them annotated, extracting and transforming the various versions of the datasets, and iterating through the whole process again.
Through key integrations with Google Cloud Platform (GCP) and Amazon Web Services (AWS), users can connect Diffgram to their data source and set up long-term Syncing so that any new files added to the dataset go directly to annotation. In the same way, data which has been annotated in one task can be funneled to another task by copying or moving it.
The event-triggered Sync feature looks quite promising because in the future it might enable customized non-linear data flows as well as conditional relationships between datasets and tasks. The possibilities for setting up the perfect data flow for each user’s needs would then become endless, so we will be closely following Diffgram’s updates!