The Acronis SCS Machine Intelligence Lab is a research organization dedicated to developing methodology and algorithms to identify, manage, and mitigate cyber risks. We collaborate with academic and other research organizations and publish our findings.
Whitepaper: Static Code Analysis: FICO-like rating the vulnerabilities of source code
Dependence on information technology extends to every facet of life. One is hard-pressed to point to an activity that does not fundamentally involve [sic] ‘cyber technology.’ Technology is uniquely responsible for explosive growth in productivity and wealth, and it has enabled the convenient lifestyle we have come to rely on and expect. Technology is used to manage health, cure disease, support manufacturing and commerce, manage resources, enable telecommunication, and more. With that, however, comes risk, which takes many forms, ranging from the benign, resulting in mere inconvenience, to the serious, with potential long-term consequences for our health, wealth, and national security. In recent years the impacts of those risks have been amply demonstrated, receiving the attention of policymakers, the media, and business, but, to a lesser extent, the public. This accessible essay is written for a mixed audience, including the general public, to increase awareness of cybersecurity.
Whitepaper: Upsampling, a comparative study with new ideas
We give a brief tour of sampling techniques in common use for imbalanced data with binary tags. We cover the rather ‘naive’ upsampling techniques, the classical bootstrap (B. Efron, 1979), and a more recent technique called SMOTE (Chawla et al., 2002). We demonstrate those techniques using features derived from the Android OS source code, with tags derived from the Common Vulnerabilities and Exposures public database (CVE, MITRE), which contains known software bugs in widely used applications. The data, with approximately 0.5% positive tags and the remaining approximately 99.5% negative tags, is extremely imbalanced. A new idea is to synthesize samples from the minority class using an idea from the Bayesian domain. An empirical analysis of the procedure is forthcoming.
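To make the techniques named above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the whitepaper's procedure: the data are synthetic Gaussian features with a 0.5% positive rate (not the Android/CVE features), the function names `random_oversample` and `smote_like` are hypothetical, and the Bayesian synthesis idea is not shown because the abstract does not specify it.

```python
import random

def random_oversample(rows, labels, rng):
    """Naive upsampling: draw minority rows with replacement (a bootstrap
    of the minority class) until both classes have the same size."""
    minority = [i for i, t in enumerate(labels) if t == 1]
    majority = [i for i, t in enumerate(labels) if t == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = majority + minority + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]

def smote_like(minority_rows, n_new, k, rng):
    """SMOTE-style synthesis: pick a minority row, pick one of its k nearest
    minority neighbours, and interpolate at a random point between them."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        # Nearest minority neighbours of base (squared Euclidean distance),
        # excluding base itself.
        others = [r for r in minority_rows if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Synthetic stand-in data (an assumption): 2000 rows, 0.5% positive tags.
rng = random.Random(0)
n, n_pos = 2000, 10
labels = [1] * n_pos + [0] * (n - n_pos)
rows = [[rng.gauss(2.0 if t else 0.0, 1.0) for _ in range(3)] for t in labels]

# Naive upsampling: duplicates existing minority rows.
X_bal, y_bal = random_oversample(rows, labels, rng)

# SMOTE-style: creates new minority rows instead of duplicates.
minority_rows = [r for r, t in zip(rows, labels) if t == 1]
X_syn = smote_like(minority_rows, n_new=n - 2 * n_pos, k=5, rng=rng)
```

Random oversampling only repeats rows that already exist, whereas the SMOTE-style step places new rows on line segments between neighbouring minority rows, so every synthetic row stays inside the region spanned by the observed minority class.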
Whitepaper: On the Equivalence of Likelihood & Cross Entropy
We demonstrate the duality between a general version of Fisher’s likelihood principle, in particular maximum likelihood, and Shannon’s entropy, in particular minimum cross entropy. We show that, in a certain sense, those two seemingly distinct concepts converge and are one and the same.
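As a point of orientation, the textbook discrete case of this equivalence can be written in a few lines. This is a sketch of the standard argument, not the general version treated in the whitepaper. The notation is an assumption: $x_1, \dots, x_n$ are i.i.d. observations, $q_\theta$ is the model, $\hat{p}$ is the empirical distribution of the sample, $H(\hat{p}, q_\theta)$ is the cross entropy, and $D_{\mathrm{KL}}$ is the Kullback-Leibler divergence.

```latex
\begin{align}
\hat{\theta}_{\mathrm{ML}}
  &= \arg\max_{\theta} \prod_{i=1}^{n} q_\theta(x_i)
   = \arg\max_{\theta} \frac{1}{n} \sum_{i=1}^{n} \log q_\theta(x_i), \\
\frac{1}{n} \sum_{i=1}^{n} \log q_\theta(x_i)
  &= \sum_{x} \hat{p}(x) \log q_\theta(x)
   = -H(\hat{p}, q_\theta), \\
H(\hat{p}, q_\theta)
  &= H(\hat{p}) + D_{\mathrm{KL}}(\hat{p} \,\|\, q_\theta).
\end{align}
```

Because $H(\hat{p})$ does not depend on $\theta$, maximizing the likelihood, minimizing the cross entropy, and minimizing the divergence from the empirical distribution all select the same $\theta$.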