
Cyber101: What is a SIEM?

Michael DeNapoli
December 18, 2025

Quite a few folks have asked, "So, what is a SIEM, and what does it do?" That question certainly deserves an answer, as a SIEM is a critical component of well-rounded cybersecurity resilience.  Let's dive into the topic.


What is a SIEM?

A Security Information and Event Management (SIEM - pronounced either "seem" or "sim" as in simplify) platform is a set of technologies that takes in data from many different sources, analyzes that data to determine whether what is being seen could indicate threat activity, and then alerts when it finds something that does.  Of course, how a SIEM does all of this is a bit more complex - and can vary from vendor to vendor - but the overall activity of a SIEM can be mapped out.  While the following is a bit of an over-simplification, SIEM solutions tend to work in similar ways:


Key Components of a SIEM Solution

First, a SIEM solution must ingest (receive copies of) information (telemetry) from security applications, operating systems, networking equipment, cloud systems, and other components of a security stack.  There are some standard methods that are routinely used to make that happen, but they vary from tool to tool in the security world, so a SIEM must be able to ingest multiple forms of telemetry in order to do anything with it.  For example, Microsoft Windows uses its own telemetry format - Windows Event Logs - that the SIEM needs to be able to take in and recognize to know what's going on in Windows systems (desktops, laptops, servers, etc.).  Cloud platforms like AWS and Azure also have their own logging and telemetry formats.  Linux - and a huge number of other systems and devices - commonly use the very popular syslog format.


Because of the very large number of log formats (and methods of getting those logs to the actual SIEM), modern SIEM solutions have integrations for the most popular security controls like endpoint defenses, networking equipment, operating systems, etc.  They also build tools to allow them to ingest common logging formats like syslog data and common networking information formats. Finally, SIEM vendors and partners often will build custom ingestion tools for their customers when necessary. The combination of these methods allows a SIEM to be able to take in data from a huge variety of software, platforms, networks, and other components of an organization's technology and infrastructure. 
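To make the ingestion step a little more concrete, here is a minimal sketch of what parsing one classic (BSD-style) syslog line might look like.  It assumes the older "<priority>timestamp host message" layout; the function name and the output field names are made up for illustration, and real SIEM parsers handle many more variations than this.

```python
import re

# Assumed layout: <PRI>Mmm dd hh:mm:ss host message (classic BSD-style syslog).
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a dict of fields, or None if the line does not match the assumed layout."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        # The priority value encodes facility * 8 + severity.
        "facility": pri // 8,
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "message": match.group("message"),
    }


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(parse_syslog(sample))
```

A line that does not match comes back as None, which is roughly where a real product would fall through to another parser or a custom integration.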


Next, all that data has to go somewhere.  SIEM solutions also contain a database designed to file away every bit of that telemetry as it comes in.  You might think this would require a huge amount of storage space somewhere, and you wouldn't be wrong, but most modern SIEM solutions have ways to keep that to a manageable amount of disk space.  First, most log data is just small text files.  These don't take up much room and are highly compressible, so the actual amount of space they take up is a lot smaller overall than many other forms of data you'd find in a database.
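As a rough illustration of why text logs compress so well, the sketch below builds a batch of made-up, repetitive log lines and compresses them with Python's standard zlib module.  The log content is invented for the example, and the exact savings will vary with real data and with whatever compression a given SIEM actually uses.

```python
import zlib

# Invented sample data: 1,000 log lines that share most of their text.
lines = [
    f"Oct 11 22:14:{i % 60:02d} host{i % 5} sshd[{1000 + i}]: "
    f"Failed password for user{i % 10}"
    for i in range(1000)
]
raw = "\n".join(lines).encode("utf-8")

# Compress the whole batch at the highest compression level.
compressed = zlib.compress(raw, level=9)

print(f"raw bytes:        {len(raw)}")
print(f"compressed bytes: {len(compressed)}")
```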


On top of log files being small, a lot of the data ends up being duplicate entries that the SIEM already got from another security control, OS, or application. If the data is indeed duplicated, the SIEM only needs to store one copy of that information in the database.  This has to be done with caution, however, as accidentally removing something that looks like a duplicate but is actually not can have severe consequences.  Most SIEM solutions have the ability to retain the metadata (data about the data, like what tool sent that log and what time it was created) while compressing down the duplicate information. 
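One way to picture deduplication that keeps the metadata is sketched below.  It stores each distinct message once, keyed by a hash of its text, and keeps a list of which source reported it and when.  The event fields ("source", "timestamp", "message") and the function name are assumptions for the example, not any vendor's actual schema.

```python
import hashlib


def deduplicate(events):
    """Store each distinct message once, but remember every (source, timestamp) that sent it."""
    stored = {}
    for event in events:
        key = hashlib.sha256(event["message"].encode("utf-8")).hexdigest()
        entry = stored.setdefault(key, {"message": event["message"], "seen": []})
        # The metadata is kept even when the message body is a duplicate.
        entry["seen"].append((event["source"], event["timestamp"]))
    return list(stored.values())


events = [
    {"source": "firewall", "timestamp": "2025-12-18T09:00:00", "message": "login failed for jdoe"},
    {"source": "endpoint", "timestamp": "2025-12-18T09:00:01", "message": "login failed for jdoe"},
    {"source": "endpoint", "timestamp": "2025-12-18T09:05:00", "message": "usb device attached"},
]
for entry in deduplicate(events):
    print(entry["message"], "reported", len(entry["seen"]), "time(s)")
```

Matching on the exact message text is deliberately conservative here: two entries that only look similar stay separate, which lines up with the caution above about removing something that is not truly a duplicate.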


Taken together, this means that the amount of storage space your SIEM would need to hold all the necessary data to identify threat activity is manageable. Add in that most log data only needs to be in the "live" database for a relatively short period of time (usually about 90 days), and the SIEM can move stale data to a much less expensive storage system before things grow out of control.  That data is still there, and can be examined whenever necessary to do investigations, but it's not taking up expensive disk space unless it's needed for that purpose. 
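The hot-versus-archive split can be sketched in a few lines.  This example assumes the roughly 90-day window mentioned above and a simple list of events with a "timestamp" field; the names are illustrative, and real products typically handle this through storage policies rather than code like this.

```python
from datetime import datetime, timedelta

# Assumption for the sketch: 90 days in the "live" database.
HOT_RETENTION = timedelta(days=90)


def split_by_age(events, now, retention=HOT_RETENTION):
    """Return (hot, cold): events still inside the retention window, and older ones to archive."""
    hot, cold = [], []
    for event in events:
        if now - event["timestamp"] <= retention:
            hot.append(event)
        else:
            cold.append(event)
    return hot, cold


now = datetime(2025, 12, 18)
events = [
    {"id": 1, "timestamp": now - timedelta(days=10)},
    {"id": 2, "timestamp": now - timedelta(days=90)},
    {"id": 3, "timestamp": now - timedelta(days=91)},
]
hot, cold = split_by_age(events, now)
print("hot:", [e["id"] for e in hot], "cold:", [e["id"] for e in cold])
```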


How SIEM Transforms Threat Detection

So now you have an ingestion methodology set to get log and other information into the SIEM - removing duplicates but keeping everything necessary.  That's flowing into a database that's storing everything needed and recent.  But, how does the SIEM use all of that information to quickly find potential and actual threat activity happening in an organization?  That's where the analytics engine comes in, the heart of a SIEM. 

Each vendor manages analytics differently, so rather than speaking about any one product, we can speak to the general operation of analytics on security data:


First, the data is normalized.  This step converts all the various forms of data into a standardized format that can be used to quickly evaluate what the SIEM is seeing without having to try to sort out what one security tool calls something as opposed to what a different security tool calls it.  Deduplication (mentioned above) can also be performed during normalization - especially since everything is now in the same format, which makes true duplicates easier to spot. 
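Normalization can be pictured as a set of per-source field maps that translate each tool's names into one shared vocabulary.  The sketch below is only that: the common field names ("user", "src_ip", and so on) and the "firewall" field names are hypothetical, and the Windows names are used purely as an illustration of a source-specific format.

```python
# Per-source maps from the tool's own field names to one shared set of names.
# All names here are illustrative assumptions, not a real SIEM schema.
FIELD_MAPS = {
    "windows": {"TargetUserName": "user", "IpAddress": "src_ip", "EventID": "event_code"},
    "firewall": {"usr": "user", "src": "src_ip", "action": "outcome"},
}


def normalize(source_type, raw):
    """Rename known fields to the shared names; unmapped fields are dropped in this sketch."""
    mapping = FIELD_MAPS[source_type]
    normalized = {"source_type": source_type}
    for field, value in raw.items():
        if field in mapping:
            normalized[mapping[field]] = value
    return normalized


win = normalize("windows", {"EventID": 4625, "TargetUserName": "jdoe", "IpAddress": "10.0.0.5"})
fw = normalize("firewall", {"usr": "jdoe", "src": "10.0.0.5", "action": "deny"})
print(win)
print(fw)
```

Once both records use the same names for the user and the source address, comparing them (or spotting true duplicates) no longer depends on knowing what each tool calls things.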


Next, the data is fed through a series of correlation rules.  These rules let the SIEM define patterns of activity, looking at groups of events that relate to each other instead of individual events alone.  Think of it like a security guard at a bank: Each person who comes in could be a bank robber, but they probably aren't.  Instead of examining each person who comes in individually, the guard looks for patterns of behavior that are unexpected or unusual, then focuses on that person.  The person walks around the public areas of the bank several times, looks up at the ceiling several times, watches what the tellers are doing behind the counters, etc.  Taken alone, each of these could just mean the person is nervous or confused as to where they're supposed to go.  Taken together, they suggest that person may be checking out the security of the bank and looking for a way around it.  Correlation rules allow the SIEM to see patterns in the incoming telemetry that align with known methods of cyber attack.  In our example, a user might try to guess a password, open a web page that isn't usual for a work computer, or try to access a file share that user doesn't normally access.  Individually, these could be simple user mistakes; but together they can indicate that a threat actor is probing for weaknesses in security controls.
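A correlation rule along the lines of that example might be sketched as follows: flag a user only when several different kinds of suspicious events show up close together in time.  The event type names, the 30-minute window, and the "three distinct types" threshold are all assumptions chosen for illustration; real rule languages and thresholds differ from vendor to vendor.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical event types taken from the example in the text.
SUSPICIOUS = {"failed_login", "unusual_web_page", "unusual_share_access"}


def correlate(events, window=timedelta(minutes=30), min_distinct=3):
    """Return users with at least min_distinct suspicious event types inside one time window."""
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        if event["type"] in SUSPICIOUS:
            by_user[event["user"]].append(event)

    flagged = set()
    for user, user_events in by_user.items():
        for i, first in enumerate(user_events):
            # Collect the distinct types seen within the window that starts at this event.
            types = {
                e["type"] for e in user_events[i:] if e["time"] - first["time"] <= window
            }
            if len(types) >= min_distinct:
                flagged.add(user)
                break
    return flagged


t0 = datetime(2025, 12, 18, 9, 0)
events = [
    {"user": "alice", "type": "failed_login", "time": t0},
    {"user": "alice", "type": "unusual_web_page", "time": t0 + timedelta(minutes=5)},
    {"user": "alice", "type": "unusual_share_access", "time": t0 + timedelta(minutes=20)},
    {"user": "bob", "type": "failed_login", "time": t0},
    {"user": "carol", "type": "failed_login", "time": t0},
    {"user": "carol", "type": "unusual_web_page", "time": t0 + timedelta(hours=1)},
    {"user": "carol", "type": "unusual_share_access", "time": t0 + timedelta(hours=2)},
]
print(correlate(events))
```

Here only alice is flagged: bob has a single failed login, and carol's three events are spread too far apart to fit one window.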


The Role of AI in SIEM Analytics

Correlation rules are only as good as the attack patterns they already know about, so a truly novel attack may not be recognized.  This is why the majority of SIEM solutions also use some form of artificial intelligence to see how seemingly unrelated patterns could indicate a new form of attack traffic.  This isn't the same as something like ChatGPT; this form of AI - Machine Learning - looks for data points that fall outside of expected values.  If the SIEM knows what "normal" looks like (due to feedback from security professionals over time), then it can use that statistical clustering to know what "abnormal" looks like - and further investigate those abnormalities.  An abnormality doesn't instantly mean a threat actor is trying to break in, but it does mean that a deeper look at the telemetry is required to figure out if that's what is happening - even if the first pass over that data didn't see anything overtly malicious.
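The idea of "values outside of what's expected" can be shown with a much simpler stand-in than real Machine Learning: a z-score check against a baseline.  To be clear, this is a basic statistical test rather than the clustering a production SIEM would use, and the baseline numbers, the threshold of 3, and the function name are all made up for the example.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (label, value) pairs whose distance from the baseline mean exceeds the threshold."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different from it counts as unusual.
        return [(label, value) for label, value in observed.items() if value != mu]
    return [
        (label, value)
        for label, value in observed.items()
        if abs(value - mu) / sigma > threshold
    ]


# Invented baseline: logins per day for one user over a "normal" week.
baseline_logins = [10, 12, 11, 9, 13, 10, 11]
this_week = {"mon": 11, "tue": 40}
print(find_anomalies(baseline_logins, this_week))
```

Tuesday's count gets flagged for a closer look, which matches the point above: an outlier is a reason to investigate, not proof of an attack.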


The combination of correlation rules and Machine Learning creates the ability for the SIEM to identify both known threat activity and unknown, but aberrant, activity.  From that analysis, the SIEM then creates alerts which let security teams know that something needs to be investigated.  Alerts commonly fall into a few key categories:

Informational: Perhaps someone attempted to log in, but failed.  If it happens often, then it needs to be investigated, but for now it's just a statistical blip that is most likely user error.


Low priority: Several failed login attempts were followed by a successful login - but nothing else even remotely suspicious happened.  This could be threat activity, but it's much more likely to be someone logging in via a mobile phone and having trouble typing in a password.  The SIEM and the security team need to keep an eye on this, but not start shutting down systems. 


Moderate priority: Now our hypothetical user has started to do things like download executable files, or download large amounts of company data, or do other things that need to be investigated, but could still be legitimate activities.  If it is indeed a false alarm, then the rules can be refined to avoid it in future - but it still definitely needs to be researched further to make sure it's not threat activity.  Security team personnel need to evaluate what happened and make the call as to whether this is legitimate activity or an attack - and the SIEM will provide as much data as possible to those teams so they can do exactly that.


High priority: Our user has now tried to install software on other machines, or add accounts to Active Directory, or encrypt large numbers of files - all direct indicators of malicious intent. At this point, devices need to be isolated and an investigation must happen immediately to contain the threat.  It's highly unlikely that this is a false alarm due to the actions that were taken; and now security personnel must react, contain, and clean up. 
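The four categories above can be read as a simple "most serious match wins" decision.  The sketch below follows the examples in the text; the indicator names, the cutoff of two failed logins for "several", and the function itself are hypothetical, and a real SIEM weighs far more context than this.

```python
# Hypothetical indicator names based on the examples in the text.
HIGH_INDICATORS = {"lateral_software_install", "directory_account_added", "mass_file_encryption"}
MODERATE_INDICATORS = {"executable_download", "bulk_data_download"}


def classify(failed_logins, successful_login, indicators):
    """Return the alert priority, checking the most serious categories first."""
    indicators = set(indicators)
    if indicators & HIGH_INDICATORS:
        return "high"
    if indicators & MODERATE_INDICATORS:
        return "moderate"
    # Assumption: "several" failed attempts means two or more.
    if failed_logins >= 2 and successful_login:
        return "low"
    if failed_logins >= 1:
        return "informational"
    return "none"


print(classify(1, False, []))
print(classify(3, True, []))
print(classify(3, True, ["bulk_data_download"]))
print(classify(3, True, ["bulk_data_download", "mass_file_encryption"]))
```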


Best Practices for Effective SIEM Implementation

To sum up, a SIEM is a data-processing solution that has three major components: an ingestion engine that allows the SIEM to absorb all different kinds of log data, a database that can store all that data after it is deduplicated and normalized, and an analytics engine that finds patterns which may - or absolutely do - indicate threat activity going on.   Put together, these components keep track of an ocean of data and turn it into information human security teams can directly act on.  This makes a SIEM solution an extremely valuable tool for any organization hoping to increase cybersecurity resilience. 

About the Author:


Michael DeNapoli is a seasoned Senior Solutions Architect with more than 25 years of experience in cybersecurity, solution architecture, and enterprise systems design. Throughout his career, he has led technical strategy, security architecture, and advanced solution development for organizations ranging from emerging security vendors to global enterprises. Michael’s expertise spans cybersecurity operations, cloud architecture, technical sales leadership, security posture management, and identity protection, with a proven track record of guiding clients through complex technology challenges. Today, he brings his deep industry knowledge to Pondurance as a Senior Solutions Architect, helping organizations strengthen their security foundations with clarity and confidence.

