Organized by the Google Developer Group (GDG), DevFest is a global event that combines a meetup and a conference for all things related to development on Google-based products. Sri Lanka has also witnessed DevFest for the past few years, courtesy of GDG Sri Lanka. Held on Sunday the 8th of October 2017 at SLIIT, Malabe, DevFest 2017 was a full-day event.
The morning consisted of a code labs session where participants were guided through some of the best practices in the world of development. These sessions covered a number of domains, ranging from Cloud and Android to Machine Learning and Deep Learning. Participants at DevFest were given first-hand experience with various technologies and tools such as Firebase, Docker, and TensorFlow.
Following lunch, the participants of GDG DevFest 2017 all gathered in the main auditorium of SLIIT, Malabe for the afternoon sessions.
Kicking things off, we had Lucas Dixon.
Lucas, a part of the Google Ideas project (now called Jigsaw), spoke to us about the Conversation AI perspective. With the internet, you can be anywhere that you want. This also means that conversations on the internet can turn ugly. People can be harassed, and this is what his topic was about. For example, after witnessing harassment of others, 27% of users refrained from posting online and 13% stopped using an online service entirely.
This affects organizations as well. Companies such as IMDb rely on engagement as their key to success, but they’ve turned off comments on their site. This is done in order to maintain a clean forum with no harassment. So how can we have good conversations? This is what is known as a wicked problem: one that is difficult or impossible to solve, and where it is hard to measure or claim success. A wicked problem can also affect PR or public engagement, culture and even decision making. Lucas gave an example of when they interviewed female tech entrepreneurs. After doing so, he and his team were themselves harassed to a great extent.
At DevFest 2017 Sri Lanka, Lucas asked a number of important questions. For starters, can AI and deep learning help find that Lost Utopia on the internet? That was indeed an interesting question. Could we use machine learning to tell us about conversations? This would be essential to understanding the emotional impact of language.
Can robots be emotional before being smart? Babies can detect emotions long before they can solve logic puzzles. If people can do it, then why can’t machines? The first question is what the machine learning model should look for, and what you want to find. The first thing to find is toxicity: a comment that makes you want to leave the discussion. Next, it is not always clear whether a comment is funny or just offensive, so the model has to be trained to differentiate the two. Context matters too, as it sets the foundation for whether a statement is toxic or not.
In order to filter out toxic comments, the model must be fed a large set of toxic comments as training data. A solution for collecting this data is crowdsourcing. You also have to analyze bias and make sure your sentences are balanced. He then showed some examples of toxic comments on platforms such as Wikipedia.
With a few more examples, and also a link to the work that he and his team are doing, Lucas’s session came to an end.
Next up, we had Manikanthan Krishnamurthy
Manikanthan, a part of the Developer Relations and Ecosystem team for South East Asia, was on stage at DevFest to speak about what’s new with developer tools from Google. He started off by speaking about the developments with Android and Google Play. The biggest announcement here was Android 8.0 Oreo. Next up was ARCore: AR at Android scale. This is AR for developers without the use of any additional hardware.
He then spoke about Android Instant Apps expanding support to over 500 million supported devices, changes and improvements to Android Things, Android Studio 3.0, Nearby Connections 2.0, Android Testing support library, and Google Play App Signing. They’ve also updated the Google App Engine with support for Java 8 and Python 3.6.
Manikanthan went on to explain changes to Web and Chrome, such as updates to the Payment Request API, Puppeteer and Firebase. The Firebase Dev Summit, he added, would take place at the end of October. With regard to TensorFlow and Assistant, he mentioned a new library called deeplearn.js that harnesses the power of GPUs.
The team is also working on bringing Google Assistant to any device, and on improving Actions on Google and bringing it to more countries. He also spoke about new products by Google such as the Google Pixel 2 and the Google Pixelbook, all of which are a combination of hardware, software and AI. In conclusion, Manikanthan spoke about certifications, new Udacity scholarships and the TL;DR Developer Show. He also encouraged the attendees at DevFest to take part in GDD India, which would take place on the 1st and 2nd of December in Bangalore, India.
We learnt about Machine Learning on Mobile from Priyanka Suduge
Priyanka, a Principal Software Engineer at Pearson, started off his session at DevFest by explaining that apart from the big companies, there’s not much development of machine learning on mobile. His session was quite interactive. Using Kahoot (kahoot.it), a tool for surveys, Priyanka gathered information from the audience on a number of topics, such as how familiar they were with machine learning and whether they could identify various items on screen. A majority of the audience had studied a little machine learning, so they knew the fundamentals.
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. Priyanka delved into the categories of machine learning, such as unsupervised learning and supervised learning. When it comes to developing tools using machine learning, there are relevant libraries depending on what you need done.
Using an example of identifying a zebra, where 11 people picked the wrong answer, Priyanka explained an app he developed that harnesses the power of machine learning. His app makes its decision based on the images on his smartphone, without connecting to the internet. He went on to explain that on certain occasions, a machine learning app that runs on a mobile could be better at the task than you are. His session then took a more technical turn, where he spoke about what you can do when you embed intelligence into a device.
He proceeded to showcase an app that would separate real messages from junk messages based on machine learning. Rather than hand-writing rules based on keywords and other elements, you can simply use a machine learning algorithm to obtain the same results in an easier and less time-consuming way. Using some machine learning libraries, he built the app and copied it to his mobile. He shared his own mobile number with the audience and asked them to send him a text message. The app then filtered the messages based on their contents and moved the filtered ones into a Spam folder. With that, Priyanka’s session came to an end.
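To give a rough idea of how a learned spam filter differs from a keyword list, here is a minimal sketch of a naive Bayes text classifier. This is not the code from Priyanka's app: the algorithm choice, the `SpamFilter` type, the labels and the training messages are all assumptions made for illustration, and it is written in Go simply because that is the language covered later in this post.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// SpamFilter is a tiny multinomial naive Bayes classifier.
// It is a hypothetical illustration, not the speaker's implementation.
type SpamFilter struct {
	wordCounts map[string]map[string]int // label -> word -> count
	totalWords map[string]int            // label -> total words seen
	docCounts  map[string]int            // label -> number of training messages
	vocab      map[string]bool           // every distinct word seen in training
}

func NewSpamFilter() *SpamFilter {
	return &SpamFilter{
		wordCounts: map[string]map[string]int{},
		totalWords: map[string]int{},
		docCounts:  map[string]int{},
		vocab:      map[string]bool{},
	}
}

// Train adds one labeled message to the model.
func (f *SpamFilter) Train(label, message string) {
	if f.wordCounts[label] == nil {
		f.wordCounts[label] = map[string]int{}
	}
	f.docCounts[label]++
	for _, w := range strings.Fields(strings.ToLower(message)) {
		f.wordCounts[label][w]++
		f.totalWords[label]++
		f.vocab[w] = true
	}
}

// Classify returns the label with the highest log-probability,
// or an empty string if the filter has not been trained.
func (f *SpamFilter) Classify(message string) string {
	totalDocs := 0
	for _, n := range f.docCounts {
		totalDocs += n
	}
	words := strings.Fields(strings.ToLower(message))
	best, bestScore := "", math.Inf(-1)
	for label, n := range f.docCounts {
		// Start from the prior: how common this label was in training.
		score := math.Log(float64(n) / float64(totalDocs))
		for _, w := range words {
			// Add-one (Laplace) smoothing so unseen words do not zero out a label.
			num := float64(f.wordCounts[label][w] + 1)
			den := float64(f.totalWords[label] + len(f.vocab))
			score += math.Log(num / den)
		}
		if score > bestScore {
			best, bestScore = label, score
		}
	}
	return best
}

func main() {
	f := NewSpamFilter()
	f.Train("spam", "win a free prize now")
	f.Train("spam", "free cash offer click now")
	f.Train("ham", "are you coming to the meetup")
	f.Train("ham", "lunch at noon tomorrow")

	fmt.Println(f.Classify("free prize offer"))
	fmt.Println(f.Classify("see you at lunch"))
}
```

With these four made-up training messages, the first test message is labeled spam and the second ham. Nobody wrote a rule saying "free" is suspicious; the model picked that up from the examples, which is the point made in the session.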
We learnt about Golang from Hasitha Liyanage
Hasitha, the Senior Director, Technology and Architecture at Sysco LABS, spoke about his newfound interest in and fascination with Golang. “Why do we have such a great language and why do people not use it?” That was his first question. The way to learn Go is not to look up benchmarks or tutorials, but rather to try it firsthand. People have spoken about Go quite a few times before, including at the last DevFest we attended.
Starting with a simple Hello World message, Hasitha spoke about the basic principles of how to code using Go. He emphasized how easy it is to code with Go. For example, you don’t have to jump through a number of hoops to get your code to run.
His session then delved into more technical aspects, where he created a reusable function. Every directory is a package, he explained. Anything that starts with a capital letter is exported; everything else is private to the package. Packages named main are applications and must implement a main() function. You can import other packages by name. The type of a variable is implied by its initial value. There are no classes in Go either.
He also spoke about methods and structs. Further, there are no exceptions in Go; instead, error handling relies on multiple return values. For concurrency, you can use goroutines. Rather than using mutex locks or queues, you can use channels to communicate between goroutines.
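The points above fit into a very small program. The sketch below is our own illustration rather than the code Hasitha wrote on stage; the function names `Greet` and `shout` are made up for the example.

```go
package main

import (
	"fmt"
	"strings" // imported by name; we can only use its capitalized identifiers
)

// Greet starts with a capital letter. In an ordinary (non-main) package
// this would make it visible to other packages that import it.
func Greet(name string) string {
	return "Hello, " + shout(name) + "!"
}

// shout starts with a lowercase letter, so it is private to this package.
func shout(s string) string {
	return strings.ToUpper(s)
}

// A package named main must provide a main() function as its entry point.
func main() {
	event := "DevFest" // the type string is implied by the initial value
	year := 2017       // the type int is implied by the initial value
	fmt.Println(Greet(event), year)
}
```

Running it prints `Hello, DEVFEST! 2017`. Note that `strings.ToUpper` is itself an example of the capital-letter rule: it is callable here only because the `strings` package exports it.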
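Here is a minimal sketch that puts structs, methods, error returns, goroutines and channels together. The `Rect` type and the `Divide` and `SumAreas` functions are hypothetical names chosen for this example and were not part of the talk.

```go
package main

import (
	"errors"
	"fmt"
)

// Rect is a plain struct. Go has no classes.
type Rect struct {
	Width, Height float64
}

// Area is a method attached to the Rect type.
func (r Rect) Area() float64 {
	return r.Width * r.Height
}

// Divide returns a result and an error value instead of throwing an exception.
func Divide(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// SumAreas computes each area in its own goroutine and collects the
// results over a channel, so no mutex is needed around the total.
func SumAreas(rects []Rect) float64 {
	results := make(chan float64)
	for _, r := range rects {
		go func(r Rect) { results <- r.Area() }(r)
	}
	total := 0.0
	for range rects {
		total += <-results // receive exactly one value per goroutine
	}
	return total
}

func main() {
	if _, err := Divide(1, 0); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(SumAreas([]Rect{{2, 3}, {4, 5}}))
}
```

The caller is expected to check the returned error right away, which is the idiom that replaces try/catch. In `SumAreas`, only the receiving loop ever touches `total`, which is why the channel removes the need for a lock.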
Following a small break, we got some Linguistic Wisdom from the Crowd
Conducted by Michael Tseng from the Google Research and Machine Intelligence team, the session began with Michael talking about Language Technology and machine intelligence. Michael also spoke about the Google Knowledge Graph. This is a knowledge base used by Google to improve its search engine’s search results with semantic-search information collected from an array of sources.
Rather than typing, we are now asking Google questions out loud from all around the world. Google then has to understand the question and display the results. That’s just turning a natural language query into a statement and pulling something from a graph. It gets more complicated when it comes to people conversing. For example, if you ask the height of Barack Obama, and Google replies with the height, and you then ask how old “he” is, Google will have to know who “he” is and continue the conversation from that point onwards.
Michael also spoke about the following points.
- Part of speech tagging
- Morphological analysis
- Syntactic parsing
- Mention chunking
- Entity Resolution
- Coreference resolution
The concept of a word is not as easy as it seems. We can train a system to find word boundaries. This is not only about finding the spaces between words. For example, written Thai does not put spaces between words. Once you have defined what a word is, you need to know what its role in the sentence is. That’s part-of-speech tagging. Through morphological analysis, you can detect more about the sentence and the subjects in the sentence.
From there, you can figure out the relationship between subjects and elements in a sentence as well. All of this can be labeled, but it’s more complicated than tagging an image with a cat. Mentions are subjects or things that have been mentioned. We like to keep track of what we’ve been talking about, so Google has to keep track of it too. This is where entity resolution comes into play. Coreference resolution is where mentions of the same subject are identified and marked. If you ask who the president of the US is and it replies Trump, and you then ask who his wife is, Google should know who “his” refers to.
He then spoke about traditional linguistic annotation. This is not something that you can learn in one day. It needs highly trained experts who follow extensive technical guidelines and work with complex tasks and tools. You often need millions of samples to perfect natural language processing.
This is where crowdsourcing traditional linguistic annotation comes in, with contributors working on complicated tasks that have been broken down. Why? Well, linguistic annotation is a very difficult thing to do. If you can’t get professionals to perform these tasks, you need contributors who can read between the lines. You need to identify entities, events and relationships. Using a sentence, he showed how coreference is created for entities and events. You can also identify event relations and frames. Mental states can also be asserted by the structure of the sentence.
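To see why word boundaries are more than "split on spaces", here is a toy segmenter. It is a swapped-in technique, not what Michael described: instead of a trained system it uses greedy longest-match lookup against a small hand-made dictionary. The `Segment` function and the word list are assumptions for illustration, and we use English text with the spaces removed to keep the example readable.

```go
package main

import "fmt"

// Segment splits text into words by repeatedly taking the longest
// dictionary entry (up to maxLen runes) that matches at the current
// position. Anything that matches no entry is emitted one rune at a time.
// It works on runes, so it is not limited to ASCII text.
func Segment(text string, dict map[string]bool, maxLen int) []string {
	runes := []rune(text)
	var words []string
	for i := 0; i < len(runes); {
		end := i + maxLen
		if end > len(runes) {
			end = len(runes)
		}
		matched := 1 // fall back to a single rune
		for j := end; j > i+1; j-- {
			if dict[string(runes[i:j])] {
				matched = j - i
				break
			}
		}
		words = append(words, string(runes[i:i+matched]))
		i += matched
	}
	return words
}

func main() {
	dict := map[string]bool{"the": true, "cat": true, "sat": true, "on": true, "mat": true}
	fmt.Println(Segment("thecatsatonthemat", dict, 3))
}
```

This prints `[the cat sat on the mat]`. A greedy lookup like this breaks down quickly on ambiguous input or words missing from the dictionary, which is exactly why a system trained on annotated examples is needed in practice.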
Basically, in conclusion, you should avoid forcing linguistic abstractions on non-experts. You should treat the crowd as experts in everyday language use. You should also develop reading comprehension as an evaluation metric and embrace ambiguity and discovery as part of the annotation process.
The last session at DevFest 2017 Sri Lanka was about Developing Crowdsourcing Applications
Presented by Christina Funk, a Software Engineer at Google, this session was about how Google looks at ways to get data. They gather data smarter and in a more efficient manner. A significant problem that they face is that there’s so much information in the world, so how do you find the most relevant results?
The goals of the team are to find meaningful data, identify and label it, and then display the results. They use the power of human intelligence to generate datasets that can serve as ground truth for a variety of big data projects. This is human computation.
In order to carry out human computation, they use crowdsourcing. The goal is to collect and label information and data from large groups of people. Contributors can be paid or unpaid. Why crowdsource? To represent a diverse population. An example of crowdsourcing is Waze. Another example is YouTube, where viewers can add subtitles or captions. Once other viewers agree on them, they become the captions for the video. This can greatly aid those with hearing impairments.
She spoke about the ways that bias can affect the data you’re collecting. Sources of bias include colors, numbers, the prominence of a single option, scrolling, prompt phrasing and also hidden features. You also want to think about context, such as culture and vocabulary, network connectivity and platform. In order to have good data, you must ask good questions. What is a good question? Christina defined it as a question with reliable and consistent answers. For example, if you take two questions such as “Where did you eat last week” and “Where did you go for your birthday”, the latter would give you a more detailed answer than the former, as it has a greater depth of explanation.
It’s important to realize that things change over time. You need to be constantly updating and reevaluating your data. What is good data? This is answered via active learning. Labeling data is expensive, so you can use smart sampling to choose what to label. Active learning uses pool-based or stream-based selection methods. She spoke about practical concerns such as batch selection and the shifting pool.
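As a rough sketch of what pool-based selection can look like, the example below uses uncertainty sampling: from a pool of unlabeled items, it picks the ones the current model is least sure about and sends only those for labeling. The sampling strategy, the `Item` type and the probabilities are our assumptions; the talk did not specify which selection rule Google uses.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Item is an unlabeled example together with the current model's
// predicted probability that it belongs to the positive class.
// Both the type and the numbers used below are hypothetical.
type Item struct {
	ID   string
	Prob float64
}

// SelectBatch returns the k items whose predictions are closest to 0.5,
// i.e. the ones the model is least certain about. The pool is not modified.
func SelectBatch(pool []Item, k int) []Item {
	ranked := make([]Item, len(pool))
	copy(ranked, pool)
	sort.SliceStable(ranked, func(i, j int) bool {
		return math.Abs(ranked[i].Prob-0.5) < math.Abs(ranked[j].Prob-0.5)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	pool := []Item{{"a", 0.95}, {"b", 0.52}, {"c", 0.10}, {"d", 0.40}, {"e", 0.70}}
	for _, it := range SelectBatch(pool, 2) {
		fmt.Println(it.ID, it.Prob)
	}
}
```

Here the batch of two comes out as items b and d, the predictions nearest 0.5. Items like a and c, where the model is already confident, are skipped, which is how the labeling budget gets spent where it helps most. Choosing several items at once is the batch selection concern mentioned above. In a real loop the pool also shifts as items get labeled and the model is retrained.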
Christina then spoke about how to implement quality control:
- Rating guidelines. This involves an instruction tutorial and example-driven training.
- Asking golden questions. Here you can get established answers from expert workers or via automatic generation. This is an indirect measure. To get the right number, use the right questions.
- Redundant rating. This is where you ask the same question of multiple raters.
- Rater grades
- Edge Cases and Ambiguity.
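Two of these checks, redundant rating and golden questions, are simple to sketch in code. The example below is our own illustration with hypothetical names (`Rating`, `MajorityVote`, `RaterAccuracy`) and made-up data. It assumes a plain majority vote and a plain accuracy score, which may well differ from what Christina's team actually does.

```go
package main

import "fmt"

// Rating is one rater's answer to one question.
type Rating struct {
	Rater, Question, Answer string
}

// MajorityVote implements redundant rating: for each question it returns
// the answer given by the most raters. Ties go to the alphabetically
// first answer so that the result is deterministic.
func MajorityVote(ratings []Rating) map[string]string {
	counts := map[string]map[string]int{}
	for _, r := range ratings {
		if counts[r.Question] == nil {
			counts[r.Question] = map[string]int{}
		}
		counts[r.Question][r.Answer]++
	}
	result := map[string]string{}
	for q, answers := range counts {
		best, bestN := "", 0
		for a, n := range answers {
			if n > bestN || (n == bestN && a < best) {
				best, bestN = a, n
			}
		}
		result[q] = best
	}
	return result
}

// RaterAccuracy scores each rater on the golden questions only, i.e. the
// questions whose correct answer is already known.
func RaterAccuracy(ratings []Rating, golden map[string]string) map[string]float64 {
	correct := map[string]int{}
	seen := map[string]int{}
	for _, r := range ratings {
		want, ok := golden[r.Question]
		if !ok {
			continue // not a golden question
		}
		seen[r.Rater]++
		if r.Answer == want {
			correct[r.Rater]++
		}
	}
	acc := map[string]float64{}
	for rater, n := range seen {
		acc[rater] = float64(correct[rater]) / float64(n)
	}
	return acc
}

func main() {
	ratings := []Rating{
		{"ann", "q1", "cat"}, {"bob", "q1", "cat"}, {"cy", "q1", "dog"},
		{"ann", "g1", "yes"}, {"bob", "g1", "yes"}, {"cy", "g1", "no"},
	}
	golden := map[string]string{"g1": "yes"}

	fmt.Println(MajorityVote(ratings))
	fmt.Println(RaterAccuracy(ratings, golden))
}
```

The accuracy on golden questions could then feed into rater grades, for example by giving less weight to raters who often miss questions with known answers.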
In conclusion, Christina spoke about how they are partnering with local developers as they know the community best, in order to further enhance their work.
That brought an end to GDG DevFest Sri Lanka 2017
With the vote of thanks delivered, GDG DevFest 2017 came to an end. Overall, it was quite an entertaining and informative event that catered to everyone with a development background, from entrepreneurs to project managers and everyone else in the community. There were even T-shirts given out to encourage more people to be a part of the GDG Community. After all, it is given “From Developers to Developers”.