First Telenor-NTNU AI-Lab Hackathon

– Hacking Telco related data with Machine Learning at the First Telenor-NTNU AI-Lab Hackathon

Join the first Telenor-NTNU AI-Lab hackathon!
You will get access to real data, code, networks, and lots of fun.

  • When: Friday 17 – Saturday 18 March
  • Where: Telenor-NTNU AI-Lab, IT-bygget at Gløshaugen
  • For whom: NTNU staff and students

Register for the Hackathon

Registration deadline: 16 March 2017, 12:00 PM CET
Limited to 50 participants

Programme

Day 1 – Friday 17 March
Time Agenda
09.00 – 09.30 Welcome
09.30 – 10.30 Presentation of:

  • Topics, Datasets and Rules
  • Awards and Evaluation Criteria
  • Awards Committee (Judging Panel)
10.30 – 11.30 Pitching Ideas and Team Formation + Lunch
11.30 – 12.00 Presentation of Mentors and Workspace assignment 
12.00 Hacking starts
16.00 Delivery / Discussion of the First Report Draft: description / overview of the idea, report on progress
17.00 Pizza
17.00 –  Hacking all night long

 

Day 2 – Saturday 18 March
Time Agenda
09.00 Breakfast
11.00 Second Report Draft mostly focused on progress
12.00 Lunch
14.00 Deadline for Submitting Demos / Code / Complete Synopsis
14.30 – 16.00 ShowCase / Presentations / Demos
16.30 – 17.00 Winners are announced
18.00 –  After Hackathon party with pizza and refreshments

Rules

  • Teams may have at most 4 members.
    Teams will be formed during the Team Formation session on the first day.
  • Once a team is formed, the first name, surname and email address of each member must be sent by email to the mentors, together with a title and a short initial description of the project, including the type of dataset to be used.
  • Mentors will be available from 12–16 on the first day and from 12–14 on the second day.

 

Awards

Three prizes will be awarded based on the creativity and innovativeness of the proposed solution as well as the quality of the presentation:

  • First prize: NOK 8000,-
  • Second prize: NOK 6000,-
  • Third prize: NOK 4000,-

Two further prizes will be available for:

  • The best teamwork: NOK 1000,-
  • The solution with best business impact: NOK 1000,-

 

Github

https://github.com/ntnutelenorhackathon

 

DATASET

Dataset 1: Text Analysis for customer care from Social Media Activity

Scenario

People often contact Telenor customer care through social media such as Facebook and Twitter, where they post about specific problems.

Related problems include real-time sentiment analysis of the posts, trend and topic analytics on the comments and posts that customers write on the page, and automatic question answering for the most common problems.

Data

Facebook posts from the Facebook page https://www.facebook.com/telenornorge/ until October 2016. The most important fields are the following:

Status_id: unique identifier for the status

Status_message: textual message

Status_author: name of the author of the post (can be also a post from Telenor Norge)

Link_name: Title of the link in case status_type is link

Status_type: photo/link/status according to the content

Status_link: url of the link in case status_type is link

Status_published: timestamp related to the published time

Num_reactions: number of reactions to the post (sum of the six reaction fields Num_likes, Num_loves, Num_wows, Num_hahas, Num_sads and Num_angrys)

Num_comments: number of comments to the post

Num_shares: number of shares to the post

Num_likes, Num_loves, Num_wows, Num_hahas, Num_sads, Num_angrys: details on the specific reactions from the users
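As a starting point for the trend and topic analytics mentioned in the scenario, the sketch below counts the most frequent terms in customer posts. It assumes each post is available as a dictionary keyed by the field names listed above (the exact column names and casing in the delivered file may differ), and the sample posts, the `top_terms` helper and the `min_length` filter are hypothetical illustrations, not part of the dataset.

```python
import re
from collections import Counter

# Hypothetical sample posts; real rows would be read from the hackathon file.
SAMPLE_POSTS = [
    {"Status_author": "Customer A",
     "Status_message": "The mobile network is down again, no coverage in Trondheim"},
    {"Status_author": "Telenor Norge",
     "Status_message": "We are working on it"},
    {"Status_author": "Customer B",
     "Status_message": "Still no coverage at home, is the mobile network down?"},
]


def tokenize(text):
    """Lower-case a message and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def top_terms(posts, n=4, min_length=4, exclude_author="Telenor Norge"):
    """Return the n most frequent terms in posts not written by exclude_author.

    Terms shorter than min_length characters are skipped as a crude
    stand-in for a proper stop-word list.
    """
    counts = Counter()
    for post in posts:
        if post["Status_author"] == exclude_author:
            continue  # keep only what customers wrote
        counts.update(
            token for token in tokenize(post["Status_message"])
            if len(token) >= min_length
        )
    return counts.most_common(n)


if __name__ == "__main__":
    for term, count in top_terms(SAMPLE_POSTS):
        print(term, count)
```

Grouping the same counts by month of Status_published would turn this into a simple trend view; a real solution would likely replace the length filter with a Norwegian stop-word list.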

Dataset 2: Customer Care Forecasting

Business scenario

To have the right number of customer agents available to answer calls within 60 seconds, it is necessary to forecast the number of incoming calls for every 15-minute interval.

The forecasting period is 6 weeks ahead.

Data

Here are the fields of the collected data. Each row represents a call from the customer:

Call_Date: Date of call

Time: Time of call; the number of calls is summarised for every 15 minutes

Service: Which queue the caller has been assigned to

Client: The product the customer is calling about

Program: Agent group with specific skillset

Type: The queue type [order, invoice, technical, ...]

Offered_calls: Target to be forecasted. Number of calls offered from the IVR (Interactive Voice Response)

Answered_calls: The number of Offered_calls answered

Lost_calls: The number of Offered_calls not answered

WT_60: Waiting time - answered calls within 60 seconds.

The dataset has been filtered to the most important Program and Type values. After filtering, the remaining Type entries all refer to the same type, although their stored values differ. The last 6 weeks of the dataset should be used as a validation set. The output dataset should contain [Call_Date, Time, Offered_calls].

The model can be evaluated using Root Mean Squared Error (RMSE).
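The validation split and the RMSE evaluation can be sketched as follows. This is a minimal baseline, not a recommended model: it assumes rows are dictionaries with Call_Date as a `datetime.date`, Time as a string and Offered_calls as a number, it ignores Service, Client and Program, and the synthetic data and helper names (`split_last_weeks`, `fit_slot_means`, `predict`) are hypothetical.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def split_last_weeks(rows, weeks=6):
    """Hold out the final `weeks` weeks of the data as a validation set."""
    last_day = max(r["Call_Date"] for r in rows)
    cutoff = last_day - timedelta(weeks=weeks)
    train = [r for r in rows if r["Call_Date"] <= cutoff]
    valid = [r for r in rows if r["Call_Date"] > cutoff]
    return train, valid


def fit_slot_means(train):
    """Baseline model: mean Offered_calls per (weekday, time-of-day) slot."""
    totals = defaultdict(lambda: [0.0, 0])
    for r in train:
        key = (r["Call_Date"].weekday(), r["Time"])
        totals[key][0] += r["Offered_calls"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def predict(slot_means, rows, fallback=0.0):
    """Predict Offered_calls for each row from its slot mean."""
    return [
        slot_means.get((r["Call_Date"].weekday(), r["Time"]), fallback)
        for r in rows
    ]


def rmse(actual, predicted):
    """Root Mean Squared Error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def make_synthetic_rows(days=70, start=date(2017, 1, 2)):
    """Hypothetical data: one 09:00 slot per day with a pure weekday pattern."""
    rows = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        rows.append({"Call_Date": day, "Time": "09:00",
                     "Offered_calls": 10 + 2 * day.weekday()})
    return rows


if __name__ == "__main__":
    train, valid = split_last_weeks(make_synthetic_rows(), weeks=6)
    predictions = predict(fit_slot_means(train), valid)
    print(rmse([r["Offered_calls"] for r in valid], predictions))
```

Because the synthetic data follows a pure weekday pattern, the baseline reproduces it exactly; on the real data the same RMSE figure gives a reference that any more elaborate forecasting model should beat.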

Dataset 3: The Heartbeat of a city

Scenario

The number of mobile phones connected to a base station is a proxy for the number of people in a geographic area. Analysing this data helps us understand how people move around in space and, more generally, the dynamics of cities: when people use certain areas, how events disrupt daily usage patterns, and how the city center differs from the surrounding areas.

Ideas for using this dataset:

  • Visualize the spatio-temporal dynamics of the data.

  • Analyze and classify areas based on cell tower activity. E.g.: Do some areas have similar dynamics?

Relevant background research: http://www.nature.com/articles/srep05276

Data

Number of phones connected to base stations in the Oslo region, sampled once per hour during one week of October. The data are in the following format:

cell_easting

cell_northing

subsperbase

date_trunc

 

where:

cell_easting: UTM zone 33N easting of the base station location

cell_northing: UTM zone 33N northing of the base station location

subsperbase: Number of mobile phones seen on this base station

date_trunc: Timestamp
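One way to approach the question of whether some areas have similar dynamics is to build an average 24-hour activity profile per base station and compare the profiles. The sketch below does this with cosine similarity, which compares the shape of two profiles regardless of their magnitude. It assumes rows are dictionaries with the four fields above and that date_trunc has been parsed into a `datetime`; the coordinates, the date and the activity patterns are hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime


def hourly_profiles(rows):
    """Average subsperbase per hour of day for each base station.

    Base stations are identified by their (cell_easting, cell_northing) pair.
    """
    sums = defaultdict(lambda: [0.0] * 24)
    counts = defaultdict(lambda: [0] * 24)
    for r in rows:
        cell = (r["cell_easting"], r["cell_northing"])
        hour = r["date_trunc"].hour
        sums[cell][hour] += r["subsperbase"]
        counts[cell][hour] += 1
    return {
        cell: [s / c if c else 0.0 for s, c in zip(sums[cell], counts[cell])]
        for cell in sums
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles (1.0 means identical shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_rows(cell, day_level, night_level):
    """Hypothetical one-day series: day_level from 08 to 16, else night_level."""
    easting, northing = cell
    return [
        {"cell_easting": easting, "cell_northing": northing,
         "subsperbase": day_level if 8 <= hour < 17 else night_level,
         "date_trunc": datetime(2016, 10, 3, hour)}  # date is an assumption
        for hour in range(24)
    ]


# Hypothetical coordinates for two office-like cells and one residential cell.
OFFICE_A, OFFICE_B, HOME_C = (262000, 6650000), (263000, 6651000), (270000, 6655000)
ROWS = (make_rows(OFFICE_A, 100, 10) + make_rows(OFFICE_B, 200, 20)
        + make_rows(HOME_C, 10, 100))

if __name__ == "__main__":
    profiles = hourly_profiles(ROWS)
    print(cosine_similarity(profiles[OFFICE_A], profiles[OFFICE_B]))
    print(cosine_similarity(profiles[OFFICE_A], profiles[HOME_C]))
```

The pairwise similarities could feed a clustering step to group base stations into area types, and the same profiles are a natural input for a spatio-temporal visualization.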

Dataset 4: Open-source Security Intelligence

Business scenario

Open-source Security Intelligence (OSINT) is about gathering information and analyzing it to reveal insights or intelligence, with the ambition to identify, understand and even predict security trends, risks and future cyber attacks. Common OSINT sources include social networks, forums, business websites, blogs, videos, and news sources. Much of this information is available only on the deep web and on dark sites. Tasks within OSINT where we see that AI/ML can bring advances include:

  1. Classification of security-relevant contents from the deep web and dark sites,

  2. Security trends extraction from media and visualization.

Data

Two different types of data can be provided:

  • Security news from media, such as WireNews, ThreatAttackNews, InfosecurityMagazine
  • Posts from the deep web and dark sites, such as SilkRoad, AlphaBay, TheRealDealMarket, Pastebin

On the data format:

  • For the security news, the data consists of XML files that contain information about Title, Time, Contents, Authors ...
  • The posts from the deep web are mostly unstructured text
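For the security trends extraction task, a first step could be to count how many news articles mention a given keyword each month. The sketch below is illustrative only: the actual XML schema is not specified here, so the `<articles>`/`<article>` wrapper elements, the use of Title, Time and Contents as element names, the date format, the keyword list and the sample articles are all assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document; the real files may be structured differently.
SAMPLE_XML = """
<articles>
  <article>
    <Title>New ransomware campaign targets hospitals</Title>
    <Time>2017-01-12</Time>
    <Contents>Attackers encrypt patient records and demand payment.</Contents>
    <Authors>Author One</Authors>
  </article>
  <article>
    <Title>Banking customers warned</Title>
    <Time>2017-01-20</Time>
    <Contents>A phishing email delivers ransomware to victims.</Contents>
    <Authors>Author Two</Authors>
  </article>
  <article>
    <Title>Record DDoS attack hits hosting provider</Title>
    <Time>2017-02-03</Time>
    <Contents>Traffic peaked during the weekend.</Contents>
    <Authors>Author Three</Authors>
  </article>
</articles>
"""

# Hypothetical keyword list; a real solution might learn topics from the text.
KEYWORDS = ("ransomware", "phishing", "ddos")


def keyword_trend(xml_text, keywords=KEYWORDS):
    """Count, per (month, keyword), how many articles mention the keyword."""
    trend = Counter()
    root = ET.fromstring(xml_text.strip())
    for article in root.iter("article"):
        # Assumes Time starts with an ISO date, so the first 7 chars are YYYY-MM.
        month = (article.findtext("Time") or "")[:7]
        text = " ".join(
            article.findtext(tag) or "" for tag in ("Title", "Contents")
        ).lower()
        for keyword in keywords:
            if keyword in text:
                trend[(month, keyword)] += 1
    return trend


if __name__ == "__main__":
    for (month, keyword), count in sorted(keyword_trend(SAMPLE_XML).items()):
        print(month, keyword, count)
```

Plain substring matching is only a placeholder; the unstructured deep web posts in particular would need a trained text classifier rather than a fixed keyword list.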
Wed, 03 May 2017 17:21:16 +0200

Organizing Committee

Awards Committee

  • Helge J. Bjorland – Senior Data Scientist, Mobile Analytics and CLM, Telenor
  • Erik Skarbø – Forecast Analyst, Mobile Analytics and CLM, Telenor
  • Arturo Amador – Big Data Group in Smart Digital, Telenor
  • Kenth Engø-Monsen – VP, Analytics and AI, Telenor Research
  • Ieva Martinkenaite – VP at Telenor Research, Head of Telenor-NTNU AI-Lab initiative
  • Helge Langseth – Professor, NTNU
  • Kerstin Bach – Ass. Professor, NTNU

Mentors

  • Hai Nguyen – Research Scientist at Telenor Research / Adj. Ass. Professor, NTNU
  • Massimiliano Ruocco – Research Scientist at Telenor Research / Adj. Ass. Professor, NTNU
  • Juwel Rana – Lead Scientist at Telenor Research / Ass. Professor, Linnaeus University