
Thursday, July 4, 2019

The delights of using Viettel's Artificial Intelligence



Internet users in Vietnam often treat "chị Google" ("Miss Google") as a source of… entertainment. Whenever "she" reads text aloud or gives directions to people in traffic, her voice always makes them want to laugh. But listeners who try Viettel's AI will find something quite different…

A "human-like" reading voice on online newspapers
Text-to-speech technology has been applied to many areas of life, among them audio news. The format gives readers a new way to take in information, but it was long dismissed as "no different from Miss Google."
That changed when the online newspaper Dân trí and many other online papers built audio news on artificial intelligence (AI) technology supplied by the Viettel Group: readers were genuinely surprised by the expressive, smooth voice coming out of their computers. Better still, they can choose the tone they prefer to feel closer to their own region: a male or female voice, a northern or a southern accent.
"Viettel's AI technology is built on speech-synthesis data, producing a machine voice that sounds as natural as a human one. We are making the machine voices ever more diverse. Besides voices by gender and region, there are also voices by age, such as children, the elderly and young adults…," said Nguyễn Hoàng Hưng, Deputy Head of the Science and Technology Department at Viettel's Cyberspace Center.
The product is rated among the best synthetic reading voices in the world today. The text-to-speech technology Viettel has developed can stand in for professional announcers and voice artists in tasks such as reading stories, reading the news and running automated information hotlines.
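Viettel has not published its engine or its API, but the general shape of a text-to-speech call is easy to show. Below is a minimal sketch using the open-source pyttsx3 library purely as a stand-in, where the voice-selection step plays the role of picking a male, female or regional voice:

```python
# A minimal text-to-speech sketch using the open-source pyttsx3 library as a
# stand-in. Viettel's engine and API are not public, so nothing here reflects
# their actual implementation.
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)  # speaking speed, in words per minute

# Selecting a voice here stands in for choosing a male/female or regional voice.
voices = engine.getProperty("voices")
if voices:
    engine.setProperty("voice", voices[0].id)

engine.say("Xin chào, đây là bản tin hôm nay.")  # "Hello, here is today's news."
engine.runAndWait()  # block until the audio has finished playing
```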
But that is only one small application of the AI technology that Viettel has researched and put into practice. In communications and marketing, the Cyberspace Center has built Reputa, a system capable of monitoring 100 percent of online channels and of detecting and accurately flagging public-opinion hot spots and signs of a communications crisis for businesses, organizations and the Government.
Social listening is hardly a novelty for businesses; in today's era of booming social media, most large companies and organizations already use it to catch communications crises in time. But Viettel's Reputa is the first system in Southeast Asia able to monitor information in video and in print newspapers as well.
According to Nguyễn Hoàng Hưng, video monitoring rests on a core speech-to-text technology that converts a video's audio track into text. Reputa then applies natural language processing on top, which lets it determine whether the speaker's attitude is positive or negative. Starting from the keywords a customer cares about, Reputa can thus identify what a video on the Internet is about and, most importantly of all, whether that content is negative.
Monitoring print newspapers, meanwhile, relies on image processing. From photographs or scans of newspaper pages, image-to-text (OCR) conversion extracts the text so the system can detect the content a business needs to track.
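Reputa's internals are proprietary, but the pipeline just described -- transcribe, match the customer's keywords, judge the tone -- can be sketched in a few lines. The toy below assumes the transcript has already been produced by a speech-to-text or OCR step, and its tiny word lists are placeholders for a real sentiment model:

```python
# A toy sketch of the monitoring pipeline described above: given text already
# produced by a speech-to-text or OCR step, match a customer's keywords and
# attach a crude sentiment label. The word lists are illustrative placeholders,
# not anything Reputa actually uses.
NEGATIVE = {"scandal", "lawsuit", "recall", "outage", "fraud"}
POSITIVE = {"award", "launch", "growth", "praise", "record"}

def scan_transcript(transcript: str, keywords: set[str]) -> dict | None:
    words = {w.strip(".,!?").lower() for w in transcript.split()}
    hits = words & {k.lower() for k in keywords}
    if not hits:
        return None  # the brand is not mentioned; nothing to report
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    tone = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"keywords": sorted(hits), "tone": tone}

print(scan_transcript("Acme faces a product recall after the outage.", {"Acme"}))
# -> {'keywords': ['acme'], 'tone': 'negative'}
```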
… and a dangerous disease in Vietnam
For about a month now, patients coming in for gastrointestinal endoscopy at several hospitals no longer have to sit around for an hour waiting for their results as before; it takes only a few minutes. The reason is Viettel's endoscopic image diagnosis support system, the first product in Vietnam to apply artificial intelligence to endoscopy.


"All the images captured during an endoscopy are fed into Viettel's system, which sorts them by their position along the digestive tract. It then outlines the regions it judges to show signs of lesions, grades how severe those lesions are and, finally, returns the results to the doctor," said Đặng Quỳnh Anh, an algorithm research engineer in Viettel's Applied Mathematics Research Division.
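Viettel has not published the system's design, but the workflow in that quote -- classify each image's position, flag suspect regions, grade them, hand a report back to the doctor -- maps onto a simple pipeline skeleton. Everything below is hypothetical structure, with stub functions standing in for the trained models:

```python
# A hypothetical skeleton of the workflow described in the quote above. The
# three stage functions are stubs standing in for trained models; nothing here
# reflects Viettel's actual system.
from dataclasses import dataclass

@dataclass
class Finding:
    location: str   # e.g. "stomach" or "duodenum"
    region: tuple   # bounding box (x, y, w, h) of the suspect area
    severity: str   # e.g. "mild", "moderate", "severe"

def classify_location(image) -> str:
    return "stomach"             # stub: position-in-tract classifier

def detect_lesions(image) -> list[tuple]:
    return [(120, 80, 40, 40)]   # stub: lesion detector

def grade(image, region) -> str:
    return "mild"                # stub: severity grader

def diagnose(images) -> list[Finding]:
    findings = []
    for img in images:
        loc = classify_location(img)
        for box in detect_lesions(img):
            findings.append(Finding(loc, box, grade(img, box)))
    return findings              # returned to the doctor for review

print(diagnose([object()]))      # one dummy "image"
```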
So instead of spending a long time inspecting images by eye and judging lesions from experience, doctors now only need to look at the results the system produces, verify them and decide on a course of treatment. Besides saving doctors time and effort, this AI product also raises the accuracy of lesion detection and avoids missing lesions that are just beginning to form.
"As the system develops further, it may not even need a doctor to double-check any more and could return results directly to the patient," Quỳnh Anh asserted.
Stomach disease may sound like a simple ailment, but it is in fact the most common illness in Vietnam, and stomach cancer also ranks third in mortality. According to the statistics, Vietnam sees over 17,000 new cases of stomach cancer each year, with roughly 15,000 deaths. What makes the disease dangerous, however, is that its early-stage symptoms are vague, so most cases are discovered late.
That is why, in healthcare, Viettel chose gastrointestinal endoscopy as its first field for AI, helping doctors detect the disease earlier and improving the chances of a cure. Once rolled out widely, the solution is expected to return diagnostic results five times faster and to triple the rate at which stomach cancer is caught at an early stage, while also helping train young doctors. Thanks to artificial intelligence, people will one day enjoy high-quality medical services everywhere.
"For now Viettel only has access to gastrointestinal endoscopy images from the Hepatobiliary Institute, so the AI technology stops at this field. But later on we will certainly extend it to other areas," Quỳnh Anh revealed.
Viettel's mission of building a digital life
Coming back to "Miss Google": Nguyễn Hoàng Hưng says the text-to-speech technology that a global corporation like Google offers is aimed only at answering informational queries and does not prioritize Vietnamese. Viettel, a Vietnamese group, went after this niche and created a product that is higher in quality and more versatile in its applications, built to serve Vietnamese users.
Among AI's many applications, Viettel has chosen to put seven artificial intelligence products into real-world use first: Cyberbot, a virtual customer-care assistant; audio news; REPUTA, the brand and reputation management system; the endoscopic image diagnosis support system; a system for monitoring and warning of changes in forest cover; a system for defending against DDoS (distributed denial-of-service) attacks; and a traffic-surveillance camera system.
Developing these AI products is Viettel's first set of answers to the problem of "building a digital life" in Vietnam. More than a timely pivot toward global technology trends, Viettel has chosen to launch AI products that can solve the practical problems Vietnamese society needs solved most.
Businesses need to professionalize and cut costs in customer care and brand communications. People need high-quality healthcare. The country's forests and environmental resources need protecting. The Government needs to build smart transportation systems. And every individual and organization needs protection online…
All of this speaks to the standing, and the responsibility for leading the country's information technology industry, that a large group like Viettel carries on its shoulders. Starting from these seven AI products, Viettel aspires to become a point where digital-technology products converge and spread into every corner of social life, delivering solutions of global value and bringing Vietnam alongside the world in the fourth industrial revolution.
Nguyễn Quỳnh - ictnews

3 Ways Artificial Intelligence Has Sparked Marketing and Sales Transformation

Artificial intelligence (AI) has been a buzzword for nearly a decade, yet it sometimes still feels as though we're just in the early stages of discovering what predictive analytics and machine learning can do for enterprises.
Nowhere is this truer than in marketing and sales functions. According to Forrester, as of 2017 marketing and sales accounted for more than 50 percent of all AI investments.
But when you look at investors who have already sunk serious money into AI projects, only 45 percent have seen any results at all. And among those who are seeing results, 25 percent agree that they've become more effective in their business processes. These discouraging numbers paint a vivid picture: most marketing and sales teams simply aren't properly equipped to implement AI.
Here are three ways in which AI has completely transformed enterprise sales and marketing in the 21st century for at least some companies:

1. Predicting outcomes to increase lead generation

Marketing is by nature a very competitive and data-driven endeavor, especially at the enterprise level. Every facet of global, cross-channel marketing relies heavily on a competent knowledge economy comprised of data inputs (and proactive recommendations) gathered at every touchpoint with visitors, leads, and customers.
A great example of an effective AI-powered marketing engine was put together by CenturyLink, which provides cloud and security solutions to digital businesses. Before implementing AI, CenturyLink already had a sales team of about 1,600 people to handle all its incoming leads, and even that number was barely enough to meet the demand, according to Harvard Business Review.
"Angie," a Conversica AI virtual assistant, was hired to do a simple job: comb through thousands of leads, send them emails, and determine which leads were "hot" and which were not. If she found a quality lead, she would entrust it to a human salesperson.
So, Angie set to work right away and started sending out 30,000 emails per month.
As it turned out, Angie has been extremely good at her job. Not only does she consistently find about 40 new hot leads per week, but she is also able to understand 99 percent of the email replies she receives from customers. The 1 percent of replies she doesn't understand gets forwarded to her human manager. It turns out Angie is also good at routing the right leads to the right reps.
All in all, CenturyLink has earned $20 for every $1 it's spent on Angie -- a net gain of $19 per dollar, or an impressive 1,900 percent ROI.

2. Recommending next steps and resolving issues

Another major way AI has helped enterprise marketing and sales teams is with the customer journey and customer support -- integral parts of the marketing and sales life cycle.
As an example, the printing giant Epson America was drowning in leads and didn’t know how to handle them anymore. Where CenturyLink was using its AI assistant to find and qualify leads (i.e., marketing), Epson had no problem with marketing or outreach -- if anything, the company was too good at this. It receives, on average, 50,000 leads per year.
In no time at all, Epson realized the force-multiplier potential of AI. It also realized that the AI virtual assistant could help its human sales teams with cross-selling, upselling and recurring orders, as well. It could also discover and report unresolved customer issues to the right customer support reps immediately.
Before it implemented its AI sales assistant, Epson had been accustomed to seeing around 20 responses for every 100 leads. But since implementing the Conversica AI, it now receives a staggering 51 responses -- a 240 percent increase -- as well as a 75 percent increase in qualified leads.
This led to an additional $2 million in revenue in the first 90 days of using that AI.

3. Creating disposable content, even advertising

Perhaps the most surprising AI implementation I've seen at the enterprise level is the ability of AI to replicate human-like writing and content creation. Researchers have spent decades trying to make computers write the way humans do, but only in recent years have they been successful. For certain types of disposable content, AI has already shown itself to be more effective than a content team would be.
The examples of AI writers emerging in the world of publishing are numerous: Forbes writers are planning on using an AI to help them pen their drafts; a slew of newspapers, including the New York Times, and the wire service Reuters use AI to write real-time financial reports and sports recaps (so-called "robot reporters"); and OpenAI claims to have developed an AI writer so good that it is too dangerous for public release.
Even more incredible, Toyota recently used IBM Watson's machine-learning capabilities to design a new kind of advertising campaign for the RAV4: one that generates entire ad scripts. They've even given it a catchy label: creative programmatic.
According to Ad Age, Toyota gave IBM Watson a list of the top 1,000 recreational activities that you might use a car to get to and tasked Watson with pairing the activities together in unexpected and intriguing ways. Watson's video script outputs were then fed to another AI tool, Imposium, a video generator that stitches together stock and original footage. The final result: 300 unique, targeted video advertisements that ran on Facebook and Instagram.

Toto, I don’t think we’re in Kansas anymore.

In the 21st century, AI is starting to look less and less like science fiction and more like the times we're living in. For enterprises that actually have the budgets to implement and experiment with AI pilot programs for marketing and sales, as we've seen, the sky's the limit.
If I’ve learned one thing from running various digital marketing transformations that also better align sales and marketing, it’s that AI is making its way into every part of the enterprise technology stack. The ROI is there. Its short- and long-term impact can be tremendous.
But don’t underestimate the time and effort required. Take inspiration from other brands that have hit the ball out of the park with AI initiatives. Keep things simple for a pilot, stay agile during implementation, and make sure to hire the right team for the job. You will see results.

7 Types Of Artificial Intelligence


Artificial Intelligence is probably the most complex and astounding creation of humanity yet. And that is disregarding the fact that the field remains largely unexplored, which means that every amazing AI application that we see today represents merely the tip of the AI iceberg, as it were. While this fact may have been stated and restated numerous times, it is still hard to gain a comprehensive perspective on the potential impact of AI in the future. The reason for this is the revolutionary impact that AI is having on society, even at such a relatively early stage in its evolution.
AI’s rapid growth and powerful capabilities have made people paranoid about the inevitability and proximity of an AI takeover. Also, the transformation brought about by AI in different industries has made business leaders and the mainstream public think that we are close to achieving the peak of AI research and maxing out AI’s potential. However, understanding the types of AI that are possible and the types that exist now will give a clearer picture of existing AI capabilities and the long road ahead for AI research.

Understanding the types of AI classification

Since AI research purports to make machines emulate human-like functioning, the degree to which an AI system can replicate human capabilities is used as the criterion for determining the types of AI. Thus, depending on how a machine compares to humans in terms of versatility and performance, AI can be classified into one of multiple types. Under such a system, an AI that can perform more human-like functions with equivalent proficiency is considered a more evolved type of AI, while an AI with limited functionality and performance is considered a simpler, less evolved type.
Based on this criterion, there are two ways in which AI is generally classified. One classifies AI and AI-enabled machines by their likeness to the human mind, and their ability to "think" and perhaps even "feel" like humans. According to this system of classification, there are four types of AI or AI-based systems: reactive machines, limited memory machines, theory of mind, and self-aware AI.

1.    Reactive Machines

These are the oldest forms of AI systems, with extremely limited capability. They emulate the human mind's ability to respond to different kinds of stimuli, but they have no memory-based functionality: they cannot use previously gained experience to inform present actions, i.e., they do not have the ability to "learn." Such machines can only respond automatically to a limited set or combination of inputs; they cannot rely on memory to improve their operations. A popular example of a reactive AI machine is IBM's Deep Blue, the machine that beat chess Grandmaster Garry Kasparov in 1997.

2.    Limited Memory

Limited memory machines are machines that, in addition to having the capabilities of purely reactive machines, are also capable of learning from historical data to make decisions. Nearly all existing applications that we know of come under this category of AI. All present-day AI systems, such as those using deep learning, are trained by large volumes of training data that they store in their memory to form a reference model for solving future problems. For instance, an image recognition AI is trained using thousands of pictures and their labels to teach it to name objects it scans. When an image is scanned by such an AI, it uses the training images as references to understand the contents of the image presented to it, and based on its “learning experience” it labels new images with increasing accuracy.
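To make "limited memory" concrete, here is a minimal, self-contained sketch of the process just described: a small classifier is trained on labeled examples (random tensors standing in for images, so the snippet runs anywhere) and then uses the experience stored in its weights to label an input it has never seen. It is an illustrative toy, not any particular production system:

```python
# A minimal sketch of "limited memory" AI: a classifier learns from labeled
# examples, then reuses that stored experience on a new input. Random tensors
# stand in for real images to keep the example self-contained.
import torch
import torch.nn as nn

torch.manual_seed(0)
images = torch.randn(64, 3 * 32 * 32)   # 64 fake 32x32 RGB "images"
labels = torch.randint(0, 10, (64,))    # ten object classes

model = nn.Sequential(nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):                 # the "learning" phase
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()

new_image = torch.randn(1, 3 * 32 * 32) # an image the model has never seen
print(model(new_image).argmax(dim=1))   # predicted class label
```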
Almost all present-day AI applications, from chatbots and virtual assistants to self-driving vehicles, are driven by limited memory AI.

3.    Theory of Mind

While the previous two types of AI have been and are found in abundance, the next two types exist, for now, either as a concept or a work in progress. Theory of mind AI is the next level of AI system, one that researchers are currently working to create. A theory of mind AI will be able to better understand the entities it is interacting with by discerning their needs, emotions, beliefs, and thought processes. While artificial emotional intelligence is already a budding industry and an area of interest for leading AI researchers, achieving the theory of mind level of AI will require development in other branches of AI as well. This is because, to truly understand human needs, AI machines will have to perceive humans as individuals whose minds can be shaped by multiple factors, essentially "understanding" humans.

4.    Self-aware

This is the final stage of AI development, and it currently exists only hypothetically. Self-aware AI is, as the name suggests, AI that has evolved to be so akin to the human brain that it has developed self-awareness. Creating this type of AI, which is decades, if not centuries, away from materializing, is and will always be the ultimate objective of all AI research. This type of AI will not only be able to understand and evoke emotions in those it interacts with, but will also have emotions, needs, beliefs, and potentially desires of its own. And this is the type of AI that doomsayers of the technology are wary of. Although the development of self-aware AI could potentially boost our progress as a civilization by leaps and bounds, it could also lead to catastrophe: once self-aware, an AI would be capable of ideas like self-preservation, which may directly or indirectly spell the end for humanity, as such an entity could easily outmaneuver the intellect of any human being and plot elaborate schemes to take over humanity.
The alternate system of classification that is more generally used in tech parlance is the classification of the technology into Artificial Narrow Intelligence (ANI), Artificial General Intelligence (AGI), and Artificial Superintelligence (ASI).

5.    Artificial Narrow Intelligence (ANI)

This type of artificial intelligence represents all the existing AI, including even the most complicated and capable AI that has ever been created to date. Artificial narrow intelligence refers to AI systems that can only perform a specific task autonomously using human-like capabilities. These machines can do nothing more than what they are programmed to do, and thus have a very limited or narrow range of competencies. According to the aforementioned system of classification, these systems correspond to all the reactive and limited memory AI. Even the most complex AI that uses machine learning and deep learning to teach itself falls under ANI.

6.    Artificial General Intelligence (AGI)

Artificial General Intelligence is the ability of an AI agent to learn, perceive, understand, and function completely like a human being. These systems will be able to independently build multiple competencies and form connections and generalizations across domains, massively cutting down on time needed for training. This will make AI systems just as capable as humans by replicating our multi-functional capabilities.

7.    Artificial Superintelligence (ASI)

The development of Artificial Superintelligence will probably mark the pinnacle of AI research, as ASI will become by far the most capable form of intelligence on earth. In addition to replicating the multi-faceted intelligence of human beings, ASI will be exceedingly better at everything it does because of overwhelmingly greater memory, faster data processing and analysis, and superior decision-making capabilities. The development of AGI and ASI will lead to a scenario most popularly referred to as the singularity. And while the potential of having such powerful machines at our disposal seems appealing, these machines may also threaten our existence or, at the very least, our way of life.

At this point, it is hard to picture the state of our world when more advanced types of AI come into being. However, it is clear that there is a long way to go: compared to where it is projected to end up, AI development is still at a rudimentary stage. For those holding a negative outlook on the future of AI, this means that it is a little too soon to be worrying about the singularity, and there is still time to ensure AI safety. And for those who are optimistic about the future of AI, the fact that we've merely scratched the surface of AI development makes the future even more exciting.

Friday, September 14, 2018

How to extract building footprints from satellite images using deep learning

I work with our partners and other researchers inside Microsoft to develop new ways to use machine learning and other AI approaches to solve global environmental challenges. In this post, we highlight a sample project of using Azure infrastructure for training a deep learning model to gain insight from geospatial data. Such tools will finally enable us to accurately monitor and measure the impact of our solutions to problems such as deforestation and human-wildlife conflict, helping us to invest in the most effective conservation efforts.


Applying machine learning to geospatial data


When we looked at the most widely-used tools and datasets in the environmental space, remote sensing data in the form of satellite images jumped out.

Today, subject matter experts working on geospatial data go through such collections manually with the assistance of traditional software, performing tasks such as locating, counting and outlining objects of interest to obtain measurements and trends. As high-resolution satellite images become readily available on a weekly or daily basis, it becomes essential to engage AI in this effort so that we can take advantage of the data to make more informed decisions.

Geospatial data and computer vision, an active field in AI, are natural partners: the field is full of tasks involving visual data that cannot be automated by traditional algorithms, an abundance of labeled data, and even more unlabeled data waiting to be understood in a timely manner. The geospatial data and machine learning communities have joined efforts on this front, publishing several datasets, such as Functional Map of the World (fMoW) and the xView Dataset, for people to create computer vision solutions on overhead imagery.

An example of infusing geospatial data and AI into applications that we use every day is using satellite images to add street map annotations of buildings. In June 2018, our colleagues at Bing announced the release of 124 million building footprints in the United States in support of the Open Street Map project, an open data initiative that powers many location based services and applications. The Bing team was able to create so many building footprints from satellite images by training and applying a deep neural network model that classifies each pixel as building or non-building. Now you can do exactly that on your own!

With the sample project that accompanies this blog post, we walk you through how to train such a model on an Azure Deep Learning Virtual Machine (DLVM). We use labeled data made available by the SpaceNet initiative to demonstrate how you can extract information from visual environmental data using deep learning. For those eager to get started, you can head over to our repo on GitHub to read about the dataset, storage options and instructions on running the code or modifying it for your own dataset.

Semantic segmentation


In computer vision, the task of masking out pixels belonging to different classes of objects, such as background or people, is referred to as semantic segmentation. The semantic segmentation model we are training (a U-Net implemented in PyTorch, different from what the Bing team used) can be applied to other tasks in analyzing satellite, aerial or drone imagery: you can use the same method to extract roads from satellite imagery, infer land use and monitor sustainable farming practices, as well as in a wide range of other domains, such as locating lungs in CT scans for lung disease prediction or evaluating a street scene.
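As a rough illustration of what such a model looks like, here is a heavily simplified U-Net-style network in PyTorch -- one downsampling stage, one upsampling stage and a skip connection. It sketches the encoder-decoder idea only and is not the deeper model used in the sample project:

```python
# A heavily simplified U-Net-style segmentation model: encoder, bottleneck,
# decoder and one skip connection. The model in the sample project is deeper;
# this sketch only illustrates the overall shape.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=3):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        e = self.enc(x)                          # encoder features
        m = self.mid(self.down(e))               # bottleneck at half resolution
        u = self.up(m)                           # back to full resolution
        d = self.dec(torch.cat([u, e], dim=1))   # skip connection
        return self.head(d)                      # (N, n_classes, H, W)

logits = TinyUNet()(torch.randn(1, 3, 64, 64))
print(logits.shape)  # torch.Size([1, 3, 64, 64])
```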

Illustration from slides by Tingwu Wang, University of Toronto

Satellite imagery data


The data from SpaceNet consists of 3-channel, high-resolution (31 cm) satellite images over four cities where buildings are abundant: Paris, Shanghai, Khartoum and Vegas. In the sample code we make use of the Vegas subset, consisting of 3854 images of 650 x 650 pixels. About 17.37 percent of the training images contain no buildings; since this is a reasonably small share of the data, we did not exclude or resample those images. In addition, 76.9 percent of all pixels in the training data are background, 15.8 percent are interior of buildings and 7.3 percent are border pixels.

Original images are cropped into nine smaller chips with some overlap using utility functions provided by SpaceNet. The labels are released as polygon shapes defined using well-known text (WKT), a markup language for representing vector geometry objects on maps. These are transformed to 2D labels of the same dimension as the input images, where each pixel is labeled as one of background, boundary of building or interior of building.
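That WKT-to-mask transformation can be sketched with shapely and rasterio. The snippet below assumes the polygon coordinates are already in pixel units and burns a buffered outline as the border class; the sample repo relies on SpaceNet's own utility functions instead:

```python
# A simplified sketch of turning WKT building polygons into a per-pixel label
# mask with shapely and rasterio. It assumes coordinates are already in pixel
# units; the sample repo uses SpaceNet's utilities instead.
import numpy as np
from rasterio import features
from shapely import wkt

BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2

def wkt_to_mask(wkt_strings, height=650, width=650, border=2.0):
    polys = [wkt.loads(s) for s in wkt_strings]
    shapes = [(p, INTERIOR) for p in polys]
    # Burn a thickened outline on top of the interiors as the boundary class.
    shapes += [(p.boundary.buffer(border), BOUNDARY) for p in polys]
    return features.rasterize(shapes, out_shape=(height, width),
                              fill=BACKGROUND, dtype=np.uint8)

mask = wkt_to_mask(["POLYGON ((100 100, 200 100, 200 180, 100 180, 100 100))"])
print(np.unique(mask))  # [0 1 2]
```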


Some chips are partially or completely empty, like the examples below. This is an artifact of the original satellite images, and the model should be robust enough not to propose building footprints in empty regions.


Training and applying the model


The sample code contains a walkthrough of carrying out the training and evaluation pipeline on a DLVM. The following segmentation results are produced by the model at various epochs during training for the input image and label pair shown above. This image features buildings with roofs of different colors, roads, pavements, trees and yards. We observe that initially the network learns to identify edges of building blocks and buildings with red roofs (different from the color of roads), followed by buildings of all roof colors after epoch 5. After epoch 7, the network has learnt that building pixels are enclosed by border pixels, separating them from road pixels. After epoch 10, smaller, noisy clusters of building pixels begin to disappear as the shape of buildings becomes more defined.


A final step is to produce the polygons: all pixels predicted to be building boundary are reassigned to background, isolating blobs of building pixels. Blobs of connected building pixels are then described in polygon format, subject to a minimum polygon area threshold, a parameter you can tune to reduce false positive proposals.
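That post-processing step might look like the sketch below, which uses rasterio to trace connected blobs of building pixels into polygons and shapely to apply the area threshold (the repo's own implementation may differ in detail):

```python
# A sketch of polygonization: drop predicted boundary pixels, trace connected
# blobs of building-interior pixels into polygons and filter by minimum area.
import numpy as np
from rasterio import features
from shapely.geometry import shape

BACKGROUND, INTERIOR, BOUNDARY = 0, 1, 2

def mask_to_polygons(pred: np.ndarray, min_area: float = 200.0):
    # Treating boundary pixels as background separates touching buildings.
    buildings = (pred == INTERIOR).astype(np.uint8)
    polys = [shape(geom) for geom, val in features.shapes(buildings) if val == 1]
    return [p for p in polys if p.area >= min_area]  # drop noisy small blobs

pred = np.zeros((650, 650), dtype=np.uint8)
pred[100:180, 100:200] = INTERIOR      # one fake building blob
print(len(mask_to_polygons(pred)))     # 1
```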

Training and model parameters


There are a number of parameters for the training process, the model architecture and the polygonization step that you can tune. We chose a learning rate of 0.0005 for the Adam optimizer (default settings for other parameters) and a batch size of 10 chips, which worked reasonably well.

Another parameter, unrelated to the CNN part of the procedure, is the minimum polygon area below which blobs of building pixels are discarded. Increasing this threshold from 0 to 300 square pixels causes the false positive count to decrease rapidly as noisy false segments are excluded. The optimum threshold is about 200 square pixels.

The weights of the three classes (background, boundary of building, interior of building) in computing the total loss during training are another parameter to experiment with. It was found that giving more weight to the interior-of-building class helps the model detect significantly more small buildings (see the figure below).
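In PyTorch, these choices -- the Adam optimizer at the learning rate mentioned earlier, a batch of 10 chips and a 1:8:1 class weighting -- take only a few lines. This is a sketch of the configuration rather than the repo's exact code, and the class-index order (background, interior, boundary) is an assumption:

```python
# A sketch of the training configuration discussed above: Adam at lr 0.0005 and
# a class-weighted cross-entropy loss up-weighting building interiors 1:8:1.
# Class-index order (background, interior, boundary) is assumed, and the 1x1
# conv is a stand-in for the real U-Net.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, kernel_size=1)  # stand-in for the segmentation model
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)

class_weights = torch.tensor([1.0, 8.0, 1.0])  # background : interior : boundary
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

images = torch.randn(10, 3, 64, 64)          # one batch of 10 chips
targets = torch.randint(0, 3, (10, 64, 64))  # per-pixel class labels

optimizer.zero_grad()
loss = loss_fn(model(images), targets)       # (N, C, H, W) logits vs (N, H, W)
loss.backward()
optimizer.step()
print(float(loss))
```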


Each plot in the figure is a histogram of building polygons in the validation set by area, from 300 square pixels to 6000. The count of true positive detections in orange is based on the area of the ground truth polygon to which the proposed polygon was matched. The top histogram is for weights in ratio 1:1:1 in the loss function for background : building interior : building boundary; the bottom histogram is for weights in ratio 1:8:1. We can see that towards the left of the histogram where small buildings are represented, the bars for true positive proposals in orange are much taller in the bottom plot.

Last thoughts


Building footprint information generated this way could be used to document the spatial distribution of settlements, allowing researchers to quantify trends in urbanization and perhaps the developmental impact of climate change such as climate migration. The techniques here can be applied in many different situations and we hope this concrete example serves as a guide to tackling your specific problem.

Another piece of good news for those dealing with geospatial data is that Azure already offers a Geo Artificial Intelligence Data Science Virtual Machine (Geo-DSVM), equipped with ESRI’s ArcGIS Pro Geographic Information System. We also created a tutorial on how to use the Geo-DSVM for training deep learning models and integrating them with ArcGIS Pro to help you get started.

Finally, if your organization is working on solutions to address environmental challenges using data and machine learning, we encourage you to apply for an AI for Earth grant so that you can be better supported in leveraging Azure resources and become a part of this purposeful community.

Thursday, September 13, 2018

Industrial Internet of Things: what, how and why?


There's a rather new concept floating around, and if you had a tough time understanding the Internet, then the Internet of Things and the Industrial Internet of Things might pose an even bigger challenge. No final definition of these concepts has seen the light of day yet, even though people have been talking about them for quite some time.
Having access to the world through a device is yesterday's news, but what if this device could connect to another device and 'talk' back and forth? That is, gather, work with and finally exchange information, even without active human intervention. Does this sound like a good idea for industry and/or everyday life? Connectivity, all-around connectivity that is, seems to be what the future has in store for us.
The final definition of the IoT will not be given by what it can do, but by what it can be made to do. Connecting a couple of machines in a network to the internet is no big deal with today's technology; the goal is to have them achieve something as a result of this interconnectivity. Anything, or indeed everything, could be sent through this channel; what matters is what we receive back.
From tracking the whereabouts of your cat to automating complex industrial processes, all of it could be done from this platform. As long as your device is connected to the internet and its configuration permits it to gather, send and process data, we could say that it is part of the IoT.
All of this, of course, leads to a digital transformation. The field of interest for us is, as always, the industrial one. Bosch and SAS, for example, describe the IoT as more or less a system of devices (be they wearables or big production machines) fitted with sensors that can gather data and intelligently act on it in order to produce new business models, new knowledge and, finally, as a result: new services.
Think of the Industrial Internet of Things as something less personal, with larger-scale implications. The sectors it usually covers are manufacturing, transportation, oil and gas, energy/utilities and many others that deal with big business solutions. It all started with a focus on optimization and automation, and the Industrial Internet of Things market as a whole is expected to reach USD 123.89 billion by 2021. Imagine a mix of machines, computers and human operators all working together with data to transform their business into something better.
The biggest industry impacted by this trend is manufacturing, which is also the biggest spender on software, hardware, services and connectivity. In second place is the transportation sector, which is investing heavily in advanced communication and monitoring systems. Last but not least, the energy and utilities sector focuses on oil and gas exploration and the smart grid, which is key to supply and network transmission/distribution. Beyond the big three, it's worth noting applications of the Industrial Internet of Things in healthcare, robotics and mining.
Since we're talking about the internet and industry, another concept that hits home is Industry 4.0. Regarded as the fourth major industrial revolution, it is really about the aforementioned digital transformation, a decided shift towards cyber-physical systems.
The real goal of this new concept is ultimately customization: giving every industry the means to personalize production, servicing and producer/consumer interaction, on top of the already mentioned cost efficiency and innovative services. Here the IoT serves as the binding interface between all these systems: cloud computing, big data, artificial intelligence, data communication, programmable logic controllers and many others.
But is everybody on board with this new era -- or, an even better question: why wouldn't they be? Apparently, the main obstacle is a lack of skills. Seeing as the digital revolution is already here, maybe it's time to invest in better training strategies or even look outside the organization, as strategic partnerships can provide access to all the necessary know-how. Another concern is security: in a free-for-all network, you would be right to worry about who may have access to your data.