Part 2: The Datatable
In the previous article, we talked about the different tables contained in the NTDS database of Active Directory. Among these, the most interesting and important table in terms of content is called the datatable. It contains user data, group data, machine data, trust relationships, etc. It is the information contained in this table that is returned in LDAP queries.
The objective of this article is therefore to present this data structure, so that in our upcoming articles, we can go into more detail on the content.
Each Active Directory object has a row in the datatable, and a row can have several thousand columns (attributes or properties). This very large number comes from the schema extension mechanism, which allows adding columns for specific objects following the installation of a new Active Directory-compatible service or a schema version upgrade. For example, when an Exchange server is installed in a company, new specific columns will be added to the datatable and will be used only by Exchange objects. The fields of these columns will be empty for other objects.
The large number of columns is one of the reasons that led Microsoft not to use a classic relational database. Indeed, PostgreSQL (1) is limited to 1600 columns per table, and SQL Server (2) to 1024.
Note: In our Active Directory audits with our IAMBuster solution, the number of columns in the datatable ranges from 1400 for functional level 2008 R2 to 3500 for functional level 2016. Indeed, Microsoft updates the schema with each new version of Windows Server (3).
The names of the datatable columns are not human-readable. They are formed as follows:
A prefix “ATT” + a letter (indicating the data type) + a numerical identifier (the attribute ID)
The letter indicates the type of data stored (4):
- j: 4-byte integer (JET_coltyp.Long)
- q / l: 8-byte signed integer (usually timestamps) (JET_coltyp.Currency)
- m: string of characters (JET_coltyp.LongText)
- k / r: Binary (JET_coltyp.LongBinary) etc.
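The naming rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `parse_column_name` helper and the `TYPE_LETTERS` dictionary are hypothetical names, and the dictionary only covers the letters listed above (the list ends with "etc.", so other letters are reported as "unknown").

```python
import re
from typing import NamedTuple

# Letter-to-type mapping copied from the list above. Other letters exist
# (the list ends with "etc."), so unknown letters are handled explicitly.
TYPE_LETTERS = {
    "j": "4-byte integer (JET_coltyp.Long)",
    "q": "8-byte signed integer (JET_coltyp.Currency)",
    "l": "8-byte signed integer (JET_coltyp.Currency)",
    "m": "string of characters (JET_coltyp.LongText)",
    "k": "binary (JET_coltyp.LongBinary)",
    "r": "binary (JET_coltyp.LongBinary)",
}

# "ATT" prefix + one type letter + numerical attribute ID
_COLUMN_PATTERN = re.compile(r"^ATT([a-z])(\d+)$")


class ColumnName(NamedTuple):
    type_letter: str
    attribute_id: int
    type_description: str


def parse_column_name(name: str) -> ColumnName:
    """Split a datatable column name into its type letter and attribute ID."""
    match = _COLUMN_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an attribute column name: {name!r}")
    letter, attribute_id = match.groups()
    description = TYPE_LETTERS.get(letter, "unknown")
    return ColumnName(letter, int(attribute_id), description)


if __name__ == "__main__":
    print(parse_column_name("ATTm590045"))
```

Columns that do not follow the "ATT" pattern raise a `ValueError`, which makes it easy to skip non-attribute columns when iterating over a full column list.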
Working out which data each column stores is not trivial, but it is easy to automate. This process was documented about ten years ago by researchers from ANSSI (5).
Our attribute search relies on two prerequisites established during an initial manual search:
- The ATTm131532 column of the datatable contains the LDAP name of the object
- The ATTc131102 column of the datatable contains the ID of the attribute
Here are the different steps:
1- We search the MSysObjects table for the columns present in the NTDS datatable. Ex: ATTm590045
2- We then search the datatable for the row whose attribute ID (column ATTc131102) equals 590045.
3- Once that row is identified, we read the corresponding LDAP name from its ATTm131532 column.
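The three steps above can be sketched as follows. This is a simplified illustration that works on plain Python dictionaries rather than a real NTDS file: the `resolve_ldap_names` function and the sample rows are hypothetical, and reading MSysObjects and the datatable from disk is left out on purpose.

```python
import re

# Prerequisites from the manual search described above.
ATTRIBUTE_ID_COLUMN = "ATTc131102"  # holds the ID of the attribute
LDAP_NAME_COLUMN = "ATTm131532"     # holds the LDAP name of the object


def resolve_ldap_names(column_names, datatable_rows):
    """Map datatable column names to LDAP names.

    column_names: names found in MSysObjects (step 1).
    datatable_rows: iterable of dicts, one per datatable row, containing
    only the populated cells.
    """
    # Step 2: index the rows that carry an attribute ID.
    id_to_ldap = {}
    for row in datatable_rows:
        attribute_id = row.get(ATTRIBUTE_ID_COLUMN)
        ldap_name = row.get(LDAP_NAME_COLUMN)
        if attribute_id is not None and ldap_name is not None:
            id_to_ldap[attribute_id] = ldap_name

    # Step 3: for each column, look up the row matching its numerical ID.
    mapping = {}
    for name in column_names:
        match = re.fullmatch(r"ATT[a-z](\d+)", name)
        if match is None:
            continue  # not an attribute column
        ldap_name = id_to_ldap.get(int(match.group(1)))
        if ldap_name is not None:
            mapping[name] = ldap_name
    return mapping


# Hypothetical sample data, for illustration only.
sample_columns = ["ATTm590045", "ATTm13", "ATTj589836", "DNT_col"]
sample_rows = [
    {ATTRIBUTE_ID_COLUMN: 590045, LDAP_NAME_COLUMN: "sAMAccountName"},
    {ATTRIBUTE_ID_COLUMN: 13, LDAP_NAME_COLUMN: "description"},
    {"ATTm590045": "jdoe"},  # an ordinary object row, ignored by the lookup
]

if __name__ == "__main__":
    print(resolve_ldap_names(sample_columns, sample_rows))
```

Building the ID index once keeps the lookup to a single pass over the datatable, which matters when the table holds thousands of columns.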
Here are some examples of interesting column names:
|Column Name|LDAP Attribute|Description|
|ATTm590045|sAMAccountName|User name, machine name, domain name (trust), etc.|
|ATTm13|description|Description of the object|
|ATTj589836|badPwdCount|Number of incorrect password attempts|
|ATTm590187|operatingSystem|Name of the operating system (e.g., Windows 10 Professional) when it’s a machine account|
|ATTm590188|operatingSystemVersion|Version of the operating system (e.g., 10.0 (19042)) when it’s a machine account|
|ATTr589970|objectSid|The object’s identifier (SID)|
The table below helps to understand why we refer to it as a “sparse” table.
Thus, as illustrated, the datatable should be imagined as a big Excel file with a lot of empty cells. The User Account object has a name (sAMAccountName), a login counter (logonCount), but does not have an operating system, which is dedicated to machine accounts. The Domain object has no other attribute here except for the default minimum password length, which is defined only at its level.
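To make the "sparse" idea concrete, here is a small sketch in which each row stores only its populated cells, and the empty cells appear only when the rows are laid out as a grid. The sample rows and the `to_dense` and `empty_ratio` helpers are hypothetical; the column names are the ones listed in the table of examples above.

```python
# Columns taken from the examples above: sAMAccountName, badPwdCount,
# operatingSystem, operatingSystemVersion.
COLUMNS = ["ATTm590045", "ATTj589836", "ATTm590187", "ATTm590188"]

# Hypothetical rows: each dict holds only the cells that have a value.
sparse_rows = [
    # user account: no operating system attributes
    {"ATTm590045": "jdoe", "ATTj589836": 2},
    # machine account: operating system attributes are populated
    {
        "ATTm590045": "WS01$",
        "ATTm590187": "Windows 10 Professional",
        "ATTm590188": "10.0 (19042)",
    },
]


def to_dense(rows, columns):
    """Lay the rows out as a grid, using None for every empty cell."""
    return [[row.get(column) for column in columns] for row in rows]


def empty_ratio(rows, columns):
    """Fraction of grid cells that hold no value."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    empty = sum(cell is None for line in to_dense(rows, columns) for cell in line)
    return empty / total


if __name__ == "__main__":
    for line in to_dense(sparse_rows, COLUMNS):
        print(line)
    print(empty_ratio(sparse_rows, COLUMNS))
```

With only four columns the effect is modest, but with several thousand columns per row the proportion of empty cells becomes very large, which is the situation described above.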
The list of correspondences between the datatable column name and the LDAP name has been published on our GitHub for three different functional domain levels:
- 2008 R2
- 2012 R2
A script that regenerates this list for a given NTDS file has also been provided, since the number of columns differs depending on the functional level of the domain. It is available at the following address: https://github.com/xmco/ntds_extract/blob/main/Part-2-La-Datatable/extract_ntds_columns_name.py
Once you have established the names of the columns you want to extract, you can access the data directly.
An example developed with the Dissect Python module (6) is available on our GitHub.
However, some types of attributes such as dates, SIDs, or encrypted data will require special processing to be exploited. Among the encrypted data, we have the cryptographic hashes of passwords (hashNT and hashLM), extremely valuable information, which we will discuss in detail in a future article.
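As an illustration of the special processing needed for dates, the sketch below converts a Windows FILETIME value (a count of 100-nanosecond intervals since 1601-01-01 UTC) into a Python `datetime`. It assumes the column being read actually holds a FILETIME; which datatable columns use this encoding is not stated here and must be checked per attribute. The `filetime_to_datetime` helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# FILETIME counts 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# Largest signed 64-bit value, commonly used to mean "never".
FILETIME_NEVER = 0x7FFFFFFFFFFFFFFF


def filetime_to_datetime(value: int) -> Optional[datetime]:
    """Convert a FILETIME integer to an aware UTC datetime.

    Returns None for values that do not represent a real date
    (zero, negative, or the "never" sentinel).
    """
    if value <= 0 or value >= FILETIME_NEVER:
        return None
    # 10 intervals of 100 ns make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


if __name__ == "__main__":
    # 116444736000000000 is the FILETIME of the Unix epoch.
    print(filetime_to_datetime(116444736000000000))
```

SIDs and encrypted values need their own decoding, which is not covered by this sketch.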
Note: The column names in the datatable do not change from one Active Directory to another (except for columns added by third-party software). Thus, the column ATTm590045 will always contain the sAMAccountName attribute.
Translated by Florian Duthu