
Acronis SCS AI4CP

Jan 26-27, 2022

Virtual Workshop of AI in Cybersecurity, Privacy, and Software Engineering 

Learn More & Register

The workshop is a special session within the 16th IEEE International Conference on Semantic Computing

Sponsored by Acronis and Acronis SCS 

This half-day workshop will feature researchers applying AI and machine learning methods and techniques to Cybersecurity, Privacy, and Software Engineering

Organizing Committee: 

Joseph R. Barr, Acronis SCS, Arizona, USA

Sanjeev Solanki, Acronis, Singapore

Schedule of Speakers and their Abstracts 

Jan 26, 10 – 11 AM PST – Keynote 60 minutes 

Keynote Speaker: Moshe Vardi, Professor, Rice University, Houston, Texas, USA 

Title: Machine Learning and Logic: Fast and Slow Thinking 

Abstract: Computer science seems to be undergoing a paradigm shift. Much of earlier research was conducted in the framework of well-understood formal models. In contrast, some of the hottest trends today shun formal models and rely on massive data sets and machine learning. A canonical example of this change is the shift in AI from logic programming to deep learning.  I will argue that the correct metaphor for this development is not a paradigm shift but paradigm expansion. Just as General Relativity augments Newtonian Mechanics, rather than replace it — we went to the moon, after all, using Newtonian Mechanics — data-driven computing augments model-driven computing. In Artificial Intelligence, machine learning and logic correspond to the two modes of human thinking: fast thinking and slow thinking. The challenge today is to integrate the model-driven and data-driven paradigms. Finally, I will describe one approach to such an integration — making logic more quantitative. 

Jan 26, 3:30 – 5:30 PM PST – Afternoon Session

Session Chair: Sanjeev Solanki, Acronis 


Short talk: 3:30 – 3:50 PM PST

Speaker: Marcus Sobel, Professor of Statistics, Temple University, Philadelphia, Pennsylvania, USA 

Title: Prediction in unbalanced grouped Correlated Binary Models Using SMOTE and Boltzmann machines 

Abstract: We observe data with a binary response. It is assumed that the data can be partitioned into groups, with each group belonging to one of several types. Groups of the same type are assumed to have similar structures. The goal is to devise models to predict responses in each of the groups. This problem arose from that of predicting which booked airline passengers will show up. The groups arose from passengers booking together. Empirical data suggests the presence of intergroup but not intragroup correlation. We formulate ‘energy’ functions that capture both the probabilities that each individual shows up and the joint probabilities that members of each group match each other. We employ Boltzmann machines for both estimation and prediction. The data is extremely imbalanced; for this reason, we employ SMOTE to augment groups by synthetic minority oversampling. Results are compared with standard logistic regression by dividing the data set into training and testing data.


Short talk: 4:00 – 4:20 PM PST 

Speaker: Tyler Thatcher, Acronis SCS AI Lab, USA 

Title: Function Embedding 

Abstract: A tutorial-style function embedding of the Android kernel with LSTM networks. The tutorial will discuss various technical issues, goodness metrics, network calibration, dealing with “big data,” “out-of-vocabulary” problems, etc. 


Short talk: 4:30 – 4:50 PM PST

Speaker: Joseph R. Barr, Acronis SCS AI Lab, USA 

Title:  On Oversampling: Synthetic Samples in an Imbalanced Data, a comparative study & enhancements 

Abstract: This talk deals with data where records are tagged with a binary tag +1/-1. A common problem encountered by a statistical modeler is a dearth of examples in one of the two classes. This problem is common in classification, especially in anomaly detection, fraud prevention, disease diagnosis, etc.  


Short talk: 5:00 – 5:30 PM PST

Speaker: Vagelis Hristidis, Computer Science, University of California, Riverside, USA 

Title:  Progress and Limitations of AI Chatbot Technologies 

Abstract: AI Chatbots have recently become popular due to the widespread use of messaging services and the advancement of Natural Language Understanding. This tutorial gives an overview of the technologies that drive chatbots, including Intent Detection, Language Generation, and Slot Filling. We also discuss the differences between chit-chat and goal-oriented chatbots – the former are trained on free-form chat logs, whereas the latter are defined manually to achieve a specific goal like booking a flight. We also provide an overview of commercial tools and platforms that can help create and deploy chatbots. Finally, we present the limitations and future work challenges in this area.

Jan 27, 3:30 – 5:30 PM PST – Afternoon Session 

Session Chair: Joseph R. Barr, Acronis SCS


Short talk: 3:30 – 3:50 PM PST

Speaker: Faisal Abu-Khzam, Professor, Department of Computer Science, Lebanese American University, Beirut, Lebanon 

Title: Network Data Anonymization

Abstract: Network data arises from several domains such as social media and biological networks. Privacy becomes an issue when such data sets are published for statistical analysis or marketing purposes, among other objectives. Anonymization is a method of choice for data security in this case. From a theoretical perspective, network data anonymization is often tackled using degree-based data anonymization, which is mainly based on the “hide in the crowd” principle. This talk discusses degree-based anonymization and its limitations and proposes more effective methods based on graph embedding and dimensionality reduction. The novel techniques allow for the practical application of some data mining tools on anonymized data without revealing hidden information.  


Short talk: 4:00 – 4:20 PM PST

Speaker:  Austin Richards, Acronis SCS AI Lab, USA 

Title: The Evolution of AI and machine learning in cybersecurity 

Abstract: Artificial intelligence and machine learning have grown continuously across many different markets; however, one issue can prevail throughout: bias. This session will dive into the techniques that can be leveraged to reduce bias in your own ML implementations. Topics include problem identification, data exploration, applied statistics, diversity, and testing.


Short talk: 4:30 – 4:50 PM PST

Speaker: Sanjeev Solanki, Acronis, Singapore 

Title: Secure Multi-Party Computation Technique for Privacy-Preserving Collaborative Computing 

Abstract: Sharing sensitive data as plaintext carries high risk and is often restricted by regulations. Existing cybersecurity products do not currently address this niche. A platform based on Secure Multi-Party Computation was created in which multiple parties can securely join their data and analyze it jointly, while no party can obtain sensitive information from the other parties. This partially distributed system supports a wide range of secure analysis tools, from simple descriptive statistics to complex machine learning-based analytics. In cases where it is necessary to visualize the data, a differential privacy technique is utilized, which preserves correlation in data.


Short talk: 5:00 – 5:20 PM PST

Speaker: Peter Shaw, Professor, Oujiang Laboratory, Wenzhou Institute, University of Chinese Academy of Sciences, Zhejiang, China 

Title: Correlation-Clustering Based Anomaly Detection 

Abstract: Anomaly detection has been studied and modeled using various methods, including clustering. This paper investigates the utility of using exact correlation clustering modeled via the Cluster Editing problem for accurate anomaly detection. We use Twitter data for this purpose to study the feasibility of analyzing network anomalies. The data pipeline produces a sequence of graphs from text to represent relations between entities (in this case, tweets). Each graph constitutes a single data element, and thus a vertex in a final graph on which the clustering is performed.
