bwiese + machinelearning   29

Protecting the protector: Hardening machine learning defenses against adversarial attacks – Microsoft Secure
Another effective approach we’ve found to add resilience against adversarial attacks is to use ensemble models. While individual models provide a prediction scoped to a particular area of expertise, we can treat those individual predictions as features to additional “ensemble” machine learning models, combining the results from our diverse set of “base classifiers” to create even stronger predictions that are more resilient to attacks.
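A minimal sketch of the stacking idea described above, not Microsoft's implementation. It assumes three hypothetical base classifiers whose scores (0 to 1) are already computed per file; the sample scores, the labels, and the small logistic-regression meta-model are all illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta_model(rows, labels, epochs=10000, lr=0.5):
    """Fit a logistic regression on base-classifier scores with batch gradient descent."""
    n = len(rows[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def ensemble_predict(weights, bias, scores):
    """Probability of 'malicious' given the base classifiers' scores."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))

# Hypothetical training data: each row holds the scores of three base classifiers.
base_scores = [
    [0.1, 0.9, 0.9], [0.9, 0.1, 0.9], [0.9, 0.9, 0.1], [0.9, 0.9, 0.9],  # malicious
    [0.1, 0.1, 0.1], [0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9],  # benign
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_meta_model(base_scores, labels)

# An attacker evades the first base classifier, but the other two still fire.
evaded_one = ensemble_predict(weights, bias, [0.05, 0.95, 0.95])
clean = ensemble_predict(weights, bias, [0.1, 0.1, 0.1])
```

In this toy setup, fooling a single base classifier does not flip the ensemble's verdict, which is the resilience property the excerpt points at.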
ai  machinelearning  cybersecurity  microsoft 
august 2018 by bwiese
Thrip: Espionage Group Hits Satellite, Telecoms, and Defense Companies | Symantec Blogs
-targeted a satellite communications operator
-target was an organization involved in geospatial imaging and mapping
-targeted three different telecoms operators
-fourth target of interest, a defense contractor
supplychain  cybersecurity  apt  china  symantec  powershell  ai  machinelearning  analytics  threathunting  satellite 
july 2018 by bwiese
Symantec Helps Uncover Cyber Espionage Activity Targeting Satellite, Telecom, Geospatial Imaging and Defense Companies in the US and Southeast Asia
TAA leverages AI and advanced machine learning to comb through Symantec’s data lake of telemetry in order to spot patterns associated with targeted attacks. This technology essentially automates what previously took thousands of hours of analyst time and is available in Symantec’s Advanced Threat Protection (ATP) product. From an initial alert triggered by TAA in January 2018, Symantec researchers were able to follow a trail that enabled them to determine that the campaign originated from machines based in mainland China.

Customers of Symantec’s DeepSight Intelligence Managed Adversary and Threat Intelligence (MATI) service have received multiple reports on “ATG14” (also known as Thrip), which detail methods of detecting and thwarting activities of this adversary.
symantec  machinelearning  ai  cybersecurity  satellite  thrip 
july 2018 by bwiese
“Not If, but When” - Reflections on the OPM Breach
While that discovery was painful, it reflected the positive fact that OPM was, as an organization, looking forward beyond its peers, embracing the new paradigm of the future: the artificially intelligent, machine learning powered capabilities of Cylance’s products and services.

Following the initial internal suspicion of a data breach, OPM made the unprecedented decision to engage with Cylance immediately and to deploy us enterprise-wide and in prevention mode in a matter of four days. OPM knew that Cylance was the only solution to detect and mitigate the attack, and concluded that if they had us deployed before the barbarians approached the gate, they would have completely prevented this particular breach. Their brave leap of faith in us and our technology to close the gap in their armor - once the exclusive role of their internal IT team - will go down as one of the boldest events in modern cybersecurity history. It was in that effort that we locked shields with them and not only discovered countless compromises beyond the initial breach, but also cleaned up a very unclean environment formerly ‘protected’ by legacy antivirus. OPM took immediate and effective action, leveraging our partnership to assist them in turning the adversary aside and protecting against future attacks.
cylance  opm  cybersecurity  machinelearning  malware 
june 2018 by bwiese
Curbing the BEC Problem Using AI and Machine Learning - Security News - Trend Micro USA
mimic the decision-making process of a security researcher through a form of AI called Expert System. The engine will check if an email is coming from a dubious email provider, as well as the similarity of the sender’s domain to that of the target organization. It will also check if the sender is using a name of an executive at the recipient’s organization, among other factors. The engine’s “high-profile user” function applies additional scrutiny and correlation with commonly spoofed senders (such as executives at the target organization) and their real email addresses.
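A rough sketch of how such rule checks might look, not Trend Micro's engine. The provider list, organization domain, executive directory, and the 0.85 similarity cutoff are all hypothetical values chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference data; a real deployment would load these from a directory.
FREE_PROVIDERS = {"freemail.example", "mail.example"}
ORG_DOMAIN = "contoso.example"
EXECUTIVES = {"jane doe": "jane.doe@contoso.example"}

def bec_findings(display_name, address):
    """Return the names of the rules that fired for one sender."""
    findings = []
    domain = address.rsplit("@", 1)[-1].lower()

    # Rule 1: mail sent from a dubious (here: free) email provider.
    if domain in FREE_PROVIDERS:
        findings.append("dubious_provider")

    # Rule 2: sender domain closely resembles, but is not, the target's domain.
    similarity = SequenceMatcher(None, domain, ORG_DOMAIN).ratio()
    if domain != ORG_DOMAIN and similarity >= 0.85:
        findings.append("lookalike_domain")

    # Rule 3: display name matches an executive, but the address is not theirs.
    real_address = EXECUTIVES.get(display_name.strip().lower())
    if real_address is not None and address.lower() != real_address:
        findings.append("executive_name_mismatch")

    return findings

spoofed = bec_findings("Jane Doe", "jane.doe@cont0so.example")
genuine = bec_findings("Jane Doe", "jane.doe@contoso.example")
```

Returning the fired rule names rather than a bare score keeps the verdict explainable, which is the usual appeal of an expert system.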
expertsystem  ai  machinelearning  bec  cybersecurity  email  workflow 
may 2018 by bwiese
Introducing Ember: An Open Source Classifier and Dataset | Endgame
Ember (Endgame Malware BEnchmark for Research) is an open source collection of 1.1 million portable executable file (PE file) sha256 hashes that were scanned by VirusTotal sometime in 2017. The dataset includes metadata, derived features from the PE files, and a benchmark model trained on those features. Importantly, ember does NOT include the files themselves so that we can avoid releasing others’ intellectual property. With this dataset, researchers can now quantify the effectiveness of new machine learning techniques against a well defined and openly available benchmark.
endgame  virustotal  machinelearning  classifier  malware  cybersecurity 
april 2018 by bwiese
The Difference Between AI, Machine Learning, and Deep Learning? | NVIDIA Blog
until recently neural networks were all but shunned by the AI research community. They had been around since the earliest days of AI, and had produced very little in the way of “intelligence.” The problem was that even the most basic neural networks were very computationally intensive, so it just wasn’t a practical approach. Still, a small heretical research group led by Geoffrey Hinton at the University of Toronto kept at it, finally parallelizing the algorithms for supercomputers to run and proving the concept, but it wasn’t until GPUs were deployed in the effort that the promise was realized.
machinelearning  ai  deeplearning  gpu  nvidia  neuralnetworks 
march 2018 by bwiese
Machine Learning and the Cloud: Disrupting Threat Detection and Prevention - YouTube
RSA 2016
@24 - compressed random model instead of PCA, builds in 8 minutes
@25 - red team attacking from Azure API, identify suspicious activity
@29 - false positive less than 1% after 28% challenge for MFA
@31:30 - only baseline normal past 45 day dataset for optimal fidelity
@45 - Lisa Brown locked out from RSA conf without MFA option
microsoft  machinelearning  cybersecurity  cloud  azure 
february 2018 by bwiese
Deciphering Malware’s use of TLS (without Decryption)
1) Flow Metadata - inbound bytes, outbound bytes, inbound packets, outbound packets; the source and destination ports; and the total duration of the flow in seconds.
2) Sequence of Packet Lengths and Times - sequence of packet lengths and packet inter-arrival times (SPLT) has been well studied [25], [39]. In our open source implementation, the SPLT elements are collected for the first 50 packets of a flow. Zero-length payloads (such as ACKs) and retransmissions are ignored. A Markov chain representation is used to model the SPLT data
3) Byte Distribution - the byte distribution can give information about the header-to-payload ratios, the composition of the application headers, and if any poorly implemented padding is added.
4) Unencrypted TLS Header Information - TLS version, the ordered list of offered ciphersuites, and the list of supported TLS extensions are collected from the client hello message. The selected ciphersuite and selected TLS extensions are collected from the server hello message. The server’s certificate is collected from the certificate message. The client’s public key length is collected from the client key exchange message, and is the length of the RSA ciphertext or DH/ECDH public key, depending on the ciphersuite. Similar to the sequence of packet lengths and times, the sequence of record lengths, times, and types is collected from TLS sessions.
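Two of the feature types listed above can be sketched in a few lines. This is an illustration, not the paper's open source implementation: the 150-byte bin size, the 10 bins, and the choice to drop zero-length payloads before taking the first 50 packets are assumptions.

```python
from collections import Counter

def byte_distribution(payload: bytes):
    """Relative frequency of each of the 256 possible byte values."""
    counts = Counter(payload)
    total = len(payload) or 1  # avoid division by zero on empty payloads
    return [counts.get(value, 0) / total for value in range(256)]

def length_transition_matrix(lengths, bin_size=150, n_bins=10):
    """Row-normalized Markov transition matrix over binned packet lengths."""
    # Ignore zero-length payloads (such as ACKs), then keep the first 50 packets.
    kept = [n for n in lengths if n > 0][:50]
    # Map each length to a bin; oversized packets fall into the last bin.
    bins = [min(n // bin_size, n_bins - 1) for n in kept]
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for current, following in zip(bins, bins[1:]):
        matrix[current][following] += 1.0
    for row in matrix:
        row_sum = sum(row)
        if row_sum:
            row[:] = [value / row_sum for value in row]
    return matrix

dist = byte_distribution(b"\x00\x00\xff\x01")
trans = length_transition_matrix([100, 0, 200, 100, 1500])
```

Flattening `trans` and appending `dist` would give a fixed-length feature vector per flow regardless of how many packets the flow contained.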
tls  malware  cisco  machinelearning  research 
february 2018 by bwiese
On Explainability in Machine Learning - MLSec Project
having a deterministic model is a necessary prerequisite to being able to explain the judgements made by that model.

As a sidenote, Dr. Chuvakin's example classifier is just oriented towards finding "BAD" things. I think this is far too broad a task. A good model should be specific in what it's looking for. Narrow scopes help build trust in the models, and also make them conceptually easier to understand.
Now there are many features one could extract from a typical HTTP server log. Here's a partial list of ones that I can think of without much effort:
machinelearning  cybersecurity 
february 2018 by bwiese
Using machine learning for anomaly detection research
anomaly - “is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.” Need to baseline for expected pattern.

getting the right data, cleaning and transforming it so that it was sufficient for his goals was the most time consuming part in the process

categorical classification to detect data points that were labeled as anomalies if they crossed a threshold of relative change compared to the hour or day before. So according to his goal he defined conditions and engineered features that helped to model what’s normal and, in relation to that, what is an anomaly. In his case a RandomForestClassifier did the best job.
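The labeling rule described above might look like the sketch below. The function name, the 50% threshold, and the sample counts are hypothetical; the resulting labels are what a classifier such as scikit-learn's RandomForestClassifier would then be trained on.

```python
def label_anomalies(hourly_counts, threshold=0.5):
    """Label an hour 1 if its value changed by more than `threshold`
    (as a fraction) relative to the hour before, else 0."""
    labels = [0]  # the first hour has no predecessor to compare against
    for previous, current in zip(hourly_counts, hourly_counts[1:]):
        if previous == 0:
            # Any activity after a silent hour counts as an unbounded change.
            change = float("inf") if current else 0.0
        else:
            change = abs(current - previous) / previous
        labels.append(1 if change > threshold else 0)
    return labels

labels = label_anomalies([100, 110, 300, 290, 0, 0])
```

Comparing against the same hour on the previous day instead only requires pairing each value with the one 24 positions earlier.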
splunk  machinelearning  randomforest 
february 2018 by bwiese
Amazon Macie automates cloud data protection with machine learning | CSO Online
Last year, Amazon unveiled Amazon Inspector, its host-based application vulnerability assessment tool to monitor what is installed and configured on each virtual Instance. This year, it’s Amazon Macie, a security service designed to automatically discover and protect sensitive data stored in AWS.

service powered by machine learning that can automatically discover and classify your data stored in Amazon S3. But Macie doesn’t stop there: once your data has been classified by Macie, it assigns each data item a business value, and then continuously monitors the data in order to detect any suspicious activity based upon access pattern
aws  cybersecurity  machinelearning 
february 2018 by bwiese
Sources: Amazon quietly acquired AI security startup for around $20M | TechCrunch
Amazon Web Services appears to be ramping up its security chops. TechCrunch has learned that the e-commerce giant’s cloud services group quietly acquired cyber security firm

The San Diego-based startup, co-founded by a team that includes two former NSA employees, uses machine learning and artificial intelligence to analyze user behavior around a company’s key IP to try to identify and stop targeted attacks before valuable customer data can be swiped.

MACIE Analytics. It uses AI to monitor how a customer’s intellectual property is being accessed in real-time, assessing who is looking at, copying or moving particular documents, and where they are when they’re doing this, in order to identify suspicious patterns of behavior and flag potential data breaches before they’ve taken place. It bills the service as a way to combat the risk of insider attacks.
aws  cybersecurity  machinelearning 
february 2018 by bwiese
Gravitational Waves Collide with Cybersecurity: Using Machine Learning Inspired by Astrophysics | Ely Kahn | Pulse | LinkedIn
Signal processing techniques like matched filtering, whitening, seasonal decomposition, etc. used in LIGO’s analysis require further adaptation to be able to “learn” and adapt to varying noise and baseline characteristics.

Even the most optimal signal processing algorithm will produce false positives. The key in mitigating them is to a) use additional information and context to perform refined classification of detected outliers, and b) measure the rate of false positives in real data and use algorithms that account for it and adapt to its changes.

Sqrrl’s unique ability to collect and provide such information represented as a graph facilitates the application of multivariate statistical analysis and machine learning... use combination of Bayesian multivariate statistics, machine learning and graph algorithms.

cybersecurity analysis is in some ways even harder than the search for GW signals. After all, the Universe is not malicious and is not trying to actively avoid being probed by us.

next level by connecting detectors via a contextual graph and combining their predictions using Bayesian statistics and graph algorithms. This approach allows us to “add up” sensitivities of different detectors without losing control over false positives.
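One simple way to combine detector outputs with Bayesian statistics is a naive Bayes update on the odds. This is an illustrative sketch, not Sqrrl's method: it assumes detectors are conditionally independent and that each detector's true and false positive rates have been measured, and all numbers are hypothetical.

```python
def combine_detectors(prior, detections):
    """Posterior probability of an attack after seeing several detectors.

    detections: list of (fired, true_positive_rate, false_positive_rate).
    Assumes detectors are conditionally independent given the true state.
    """
    odds = prior / (1.0 - prior)
    for fired, tpr, fpr in detections:
        if fired:
            odds *= tpr / fpr  # likelihood ratio of an alert
        else:
            odds *= (1.0 - tpr) / (1.0 - fpr)  # likelihood ratio of silence
    return odds / (1.0 + odds)

one_alert = combine_detectors(0.01, [(True, 0.9, 0.1)])
two_alerts = combine_detectors(0.01, [(True, 0.9, 0.1), (True, 0.9, 0.1)])
```

With a 1% prior, one alert from a detector with a 10% false positive rate leaves the posterior low, while two agreeing detectors raise it substantially. Because the measured false positive rate enters every update, it stays under explicit control as detectors are added.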
cybersecurity  sqrrl  machinelearning  physics  gravitationalwaves 
january 2018 by bwiese
MLSec Project
Open-source projects and community for promotion of Data Science and Machine Learning in Information Security
cybersecurity  machinelearning  security 
january 2018 by bwiese
Detecting Encrypted Malware Traffic (Without Decryption)
As an overview, Figure 1 provides a simplified view of a TLS session. In TLS 1.2 [4], the majority of the interesting TLS handshake messages are unencrypted, and are displayed in red in Figure 1. All of the TLS-specific information that we use for classification comes from the ClientHello, which will also be accessible in TLS 1.3 [7].
cisco  machinelearning  tls  cybersecurity  analytics 
december 2017 by bwiese
EdgeRank Is Dead: Facebook's News Feed Algorithm Now Has Close To 100K Weight Factors
that there are as many as “100,000 individual weights in the model that produces News Feed.” The three original EdgeRank elements — Affinity, Weight and Time Decay — are still factors in News Feed ranking, but “other things are equally important,” he says.

In other words, the News Feed algorithm of today is much more sophisticated than just a couple years ago.
machinelearning  marketing  advertising  edgerank  facebook 
december 2013 by bwiese
