Data Model
The Data Model, strongly inspired by CybOX, is an organization of the objects that may be monitored from a host-based or network-based perspective. Each object can be described along two dimensions: its actions and its fields. Paired together, the three-tuple (object, action, field) acts like a coordinate and describes which properties and state changes of the object a sensor can capture.
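As a minimal sketch of the coordinate idea, the three-tuple can be modeled as a small record, and a sensor as the set of coordinates it is able to capture. The names `Coordinate`, `sensor_coverage`, and `captures`, and the sensor itself, are hypothetical and chosen only for illustration; they are not part of the Data Model.

```python
from typing import NamedTuple


class Coordinate(NamedTuple):
    """One (object, action, field) point in the Data Model."""
    obj: str
    action: str
    field: str


# Hypothetical sensor: the coordinates it is assumed to capture.
sensor_coverage = {
    Coordinate("process", "create", "command_line"),
    Coordinate("process", "create", "pid"),
    Coordinate("file", "delete", "file_path"),
}


def captures(coverage, obj, action, field):
    """Return True if the coverage set contains the given coordinate."""
    return Coordinate(obj, action, field) in coverage


print(captures(sensor_coverage, "process", "create", "pid"))
print(captures(sensor_coverage, "process", "terminate", "pid"))
```

Treating coverage as a set of tuples keeps the lookup trivial and makes it easy to compare two sensors with ordinary set operations.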
Summary
Object | Actions | Fields |
---|---|---|
authentication | error failure success | ad_domain app_name auth_service auth_target decision_reason fqdn hostname method response_time target_ad_domain target_uid target_user target_user_role target_user_type uid user user_agent user_role user_type |
driver | load unload | base_address fqdn hostname image_path md5_hash module_name pid sha1_hash sha256_hash signature_valid signer |
email | block delete deliver quarantine redirect | action_reason attachment_mime_type attachment_name attachment_size date dest_address dest_ip dest_port from message_body message_links message_type return_address server_relay smtp_uid src_address src_domain src_ip src_port subject to |
file | acl_modify create delete modify read timestomp write | company content creation_time extension file_name file_path fqdn gid group hostname image_path link_target md5_hash mime_type mode owner owner_uid pid ppid previous_creation_time sha1_hash sha256_hash signature_valid signer uid user |
flow | end message start | application_protocol content dest_fqdn dest_hostname dest_ip dest_port end_time exe fqdn hostname image_path in_bytes network_direction out_bytes packet_count pid ppid proto_info src_fqdn src_hostname src_ip src_port start_time tcp_flags transport_protocol uid user |
http | get post put tunnel | hostname http_version request_body_bytes request_body_content request_referrer requester_ip_address response_body_bytes response_body_content response_status_code url_domain url_full url_remainder url_scheme user_agent_device user_agent_full user_agent_name user_agent_version |
module | load unload | base_address fqdn hostname image_path md5_hash module_name module_path pid sha1_hash sha256_hash signature_valid signer tid |
process | access create terminate | access_level call_trace command_line current_working_directory env_vars exe fqdn guid hostname image_path integrity_level md5_hash parent_command_line parent_exe parent_guid parent_image_path pid ppid sha1_hash sha256_hash sid signature_valid signer target_address target_guid target_name target_pid uid user |
registry | add key_edit remove value_edit | data fqdn hive hostname image_path key new_content pid type user value |
service | create delete pause start stop | command_line exe fqdn hostname image_path name pid ppid uid user |
socket | bind close listen | family image_path local_address local_path local_port pid protocol remote_address remote_port success |
thread | create remote_create suspend terminate | hostname src_pid src_tid stack_base stack_limit start_address start_function start_module start_module_name tgt_pid tgt_tid uid user user_stack_base user_stack_limit |
user_session | lock login logout reconnect unlock | dest_ip dest_port hostname login_id login_successful login_type src_ip src_port uid user |
What is the data model?
Objects
In the Data Model, an object is much like an object in computer science. Objects are the items that the data actually represents, such as hosts, files, and connections. They are the nouns of the Data Model vocabulary.
Actions
An action refers to a state change or event that happens on an object, such as an object’s creation, destruction, or modification. Actions are the verbs: they describe what an object can do and what can happen to it. However, there are cases where sensors do not monitor actions on objects but merely scan for and check the presence of an object. Each action is represented in a coverage matrix (the 2D table), with the actions on the y-axis.
Fields
A field refers to an observable property of an object. These properties may contain flags, identifiers, data elements, or even references to other objects. In terms of vocabulary, fields are like adjectives: they describe properties of an object. A sensor monitors fields in the context of an object and outputs them as some form of structured data. Once the data is ingested into a SIEM, the logs can be queried by imposing restrictions or patterns on one or more objects, as an analytic does. On the coverage matrix, fields are on the x-axis.
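The coverage matrix described above, with actions on the y-axis and fields on the x-axis, can be sketched for a single object as a 2D table of booleans. This is only an illustration: the action and field lists are a small subset of the `process` object, and the `covered` pairs belong to a hypothetical sensor.

```python
# Subset of the "process" object's actions (rows) and fields (columns).
ACTIONS = ["create", "terminate"]
FIELDS = ["command_line", "exe", "pid"]

# Hypothetical sensor: the (action, field) pairs it is assumed to report.
covered = {
    ("create", "command_line"),
    ("create", "pid"),
    ("terminate", "pid"),
}


def coverage_matrix(actions, fields, covered_pairs):
    """Build a row-per-action, column-per-field table of booleans."""
    return [[(a, f) in covered_pairs for f in fields] for a in actions]


def render(actions, fields, matrix):
    """Format the matrix as text, marking covered cells with an 'x'."""
    width = max(len(a) for a in actions)
    lines = [" " * width + " | " + " | ".join(fields)]
    for action, row in zip(actions, matrix):
        cells = [("x" if hit else "").center(len(f)) for f, hit in zip(fields, row)]
        lines.append(action.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


matrix = coverage_matrix(ACTIONS, FIELDS, covered)
print(render(ACTIONS, FIELDS, matrix))
```

Each cell answers one question: does this sensor report this field when this action occurs on the object?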
Coverage
To gauge the usefulness of a sensor with respect to analytics, its output must be mapped into the Data Model. For each object that a sensor measures, it captures state. Some sensors periodically scan for objects instead of monitoring for state changes. In these cases, state changes may be inferred by looking for changes in the properties of an object between scans.
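For a scanning sensor, inferring state changes can be sketched as a comparison of two successive scans. The example below assumes each scan is a dictionary keyed by `file_path` whose values hold the observed fields. The function name `infer_actions`, the paths, and the hash strings are hypothetical placeholders, and a real sensor may expose its scan results differently.

```python
def infer_actions(previous, current):
    """Compare two scans (file_path -> fields) and infer file actions."""
    events = []
    # Present now but not before: the object was created.
    for path in current.keys() - previous.keys():
        events.append(("file", "create", path))
    # Present before but not now: the object was deleted.
    for path in previous.keys() - current.keys():
        events.append(("file", "delete", path))
    # Present in both scans with different field values: modified.
    for path in previous.keys() & current.keys():
        if previous[path] != current[path]:
            events.append(("file", "modify", path))
    return sorted(events)


# Placeholder scan results; the hash values are not real digests.
before = {"/tmp/a": {"md5_hash": "aa"}, "/tmp/b": {"md5_hash": "bb"}}
after = {"/tmp/b": {"md5_hash": "cc"}, "/tmp/c": {"md5_hash": "dd"}}

for event in infer_actions(before, after):
    print(event)
```

Anything that happens and is undone between two scans is invisible to this approach, which is one reason a scanning sensor generally covers fewer actions than one that monitors events directly.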
A summary of data model coverage is here.