Microsoft Catalog Files and Digital Signatures decoded

TL;DR: Parse and print .cat files: parsemscat

Introduction

Günther Deschner and I are looking into the new Microsoft Printing Protocol [MS-PAR]. Printing always means you have to deal with drivers. Microsoft package-aware v3 print drivers and v4 print drivers contain Microsoft Catalog files.

A Catalog file (.cat) is a digitally-signed file. To be more precise, it is a PKCS7 certificate with embedded data. Before I started looking into the problem of understanding them, I searched the web to see if someone had already decoded them. I found a post by Richard Hughes: Building a better catalog file. Richard described some of the things we had already discovered and some new details. It looks like he gave up when it came to understanding the embedded data and writing an ASN.1 description for it. I spent the last two weeks decoding the myth of Catalog files and created a tool for parsing them and printing what they contain in human-readable form.

Details

The embedded data in the PKCS7 signature of a Microsoft Catalog is a Certificate Trust List (CTL). Nikos Mavrogiannopoulos taught me ASN.1 and helped to create an ASN.1 description for the CTL. With this description I was able to start parsing Catalog files.

CATALOG {}
DEFINITIONS IMPLICIT TAGS ::=

BEGIN

-- CATALOG_NAME_VALUE
CatalogNameValue ::= SEQUENCE {
    name       BMPString, -- UCS2-BE
    flags      INTEGER,
    value      OCTET STRING -- UCS2-LE
}

...

END

mscat.asn
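
The comments in the ASN.1 snippet say that the value of a CatalogNameValue is stored as UCS2-LE inside an OCTET STRING, so it has to be converted before it can be printed. The following is a minimal sketch of such a conversion to UTF-8. The function name ucs2le_to_utf8 is made up for this example and is not taken from the tool, and the sketch assumes plain UCS-2 without any validation of the code units.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a UCS-2LE buffer (two bytes per character, low byte first) into a
 * NUL-terminated UTF-8 string. Conversion stops at the first 0x0000 unit.
 *
 * Returns the number of bytes written (without the NUL), or -1 if the input
 * length is odd or the output buffer is too small.
 */
long ucs2le_to_utf8(const uint8_t *in, size_t in_len,
                    char *out, size_t out_size)
{
    size_t o = 0;

    if (in_len % 2 != 0 || out_size == 0) {
        return -1;
    }

    for (size_t i = 0; i < in_len; i += 2) {
        uint16_t c = (uint16_t)(in[i] | (in[i + 1] << 8));

        if (c == 0) {
            break; /* terminating NUL character */
        }

        if (c < 0x80) {
            /* 1 byte: 0xxxxxxx */
            if (o + 1 >= out_size) return -1;
            out[o++] = (char)c;
        } else if (c < 0x800) {
            /* 2 bytes: 110xxxxx 10xxxxxx */
            if (o + 2 >= out_size) return -1;
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
            if (o + 3 >= out_size) return -1;
            out[o++] = (char)(0xE0 | (c >> 12));
            out[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }

    out[o] = '\0';
    return (long)o;
}
```

The name field is a BMPString, i.e. big-endian, so a matching helper for it would only have to swap the two bytes when building the code unit.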

The PKCS7 part of the .cat-file is the signature for the CTL. Nikos implemented support to get the embedded raw data from the PKCS7 Signature with GnuTLS. It is also possible to verify the signature using GnuTLS now!
The CTL includes members and attributes. A member holds information about a file included in the driver package, its OS attributes and often a hash of the file content, either SHA1 or SHA256. I’ve written abstracted functions so that it is possible to create a library, and a simple command line tool called dumpmscat.

Here is an example of the output:

CATALOG MEMBER COUNT=1
CATALOG MEMBER
  CHECKSUM: E5221540DC4B974F54DB4E390BFF4132399C8037

  FILE: sambap1000.inf, FLAGS=0x10010001
  OSATTR: 2:6.0,2:6.1,2:6.4, FLAGS=0x10010001
  MAC: SHA1, DIGEST: E5221540DC4B974F54DB4E390BFF4132399C8037
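
The OSATTR value in the output above is a comma-separated list. Here is a small sketch of how such a string could be split up, assuming that every entry has the form platform:major.minor, which is only what the sample output suggests. The struct os_attr and the function parse_osattr are hypothetical names for this example and are not part of dumpmscat.

```c
#include <stdio.h>

/* One entry of an OSATTR list, e.g. "2:6.1" (assumed layout). */
struct os_attr {
    unsigned int platform;
    unsigned int major;
    unsigned int minor;
};

/*
 * Parse a string like "2:6.0,2:6.1,2:6.4" into an array of entries.
 *
 * Returns the number of entries parsed, or -1 if the string is malformed
 * or contains more than max entries.
 */
int parse_osattr(const char *s, struct os_attr *out, int max)
{
    int count = 0;

    while (*s != '\0') {
        int used = 0;

        if (count >= max) {
            return -1;
        }

        /* %n stores how many characters this entry consumed */
        if (sscanf(s, "%u:%u.%u%n",
                   &out[count].platform,
                   &out[count].major,
                   &out[count].minor,
                   &used) != 3) {
            return -1;
        }
        s += used;
        count++;

        if (*s == ',') {
            s++; /* skip the separator and continue with the next entry */
        } else if (*s != '\0') {
            return -1; /* unexpected character after an entry */
        }
    }

    return count;
}
```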

In addition, the CTL normally has a list of attributes. These attributes normally contain OS flags, version information and hardware IDs.

CATALOG ATTRIBUTE COUNT=2
  NAME=OS, FLAGS=0x10010001, VALUE=VistaX86,7X86,10X86
  NAME=HWID1, FLAGS=0x10010001, VALUE=usb\\vid_0ff0&pid_ff00&mi_01

Currently the project only has a command line tool called dumpmscat, and it can only print the CTL for now. I plan to add options to verify the signature, dump only parts etc. When this is done I will create a library so it can easily be consumed by other software. If someone is interested and wants to contribute, something like signtool.exe would be nice to have.

Understanding Winbind

I recently fixed a bug resolving Domain Local groups in Winbind. I was asked how to reproduce it with a more complex setup, so I had to dig through the Winbind code to understand everything in more detail. I have documented my findings here, in order to retain what I’ve learned and to help others understand how Winbind works.

The Setup

We have a forest with two AD domains: level1.discworld.site and level2.discworld.site. The two domains have a two-way trust. User accounts are created on LEVEL1, groups and machine accounts are on LEVEL2. We have a Linux machine named ‘linux’, with Winbind joined to LEVEL2. I will describe everything from the perspective of Winbind, so inside of Winbind LEVEL2 is also referred to as the ‘own domain’.

Users:
LEVEL1\ab
LEVEL1\asn
LEVEL1\gd

Groups:
LEVEL2\samba (members: LEVEL1\ab, LEVEL1\asn, LEVEL1\gd)

Machine Accounts:
LEVEL2\linux$

Winbind Startup

Let’s assume we have successfully joined the machine ‘linux’ to LEVEL2 and then start Winbind. There is a parent Winbind process which delegates work to Winbind children. The parent forks a child for each logical domain, so in this setup there are 4 domain child processes: LEVEL1, LEVEL2, BUILTIN and SAMBA (local SAM). LEVEL1 and LEVEL2 will connect to their corresponding AD domain controllers.

Querying information from AD domain controllers

If we want to obtain information about users or groups, we have to query a Domain Controller. There are two ways to look up this information. If the corresponding user is not logged in, we query for this information using the machine account. The machine account has limited permissions to query information, especially on Domain Controllers of trusted domains, so most of the time this information is incomplete (or may be incorrect), as we cannot provide more than what the AD domain controllers allow us to read. These queries are often expensive, so caching is important to reduce the load on the domain controllers. Correct information about e.g. the group memberships of a user is obtained when we authenticate as this user. The domain controller then collects the information with the token of the user and sends it to Winbind (netr_LogonSamLogon or Kerberos PAC). In Winbind we cache this information.

We have an issue here: if you get the information about a user with the machine account and cache it, then authenticate as the user and ask for the information again, we may return the information from the cache. Likewise, if the groups of a user change while the user is logged in, we will not get an update until the user logs in again.

Authentication

If Winbind authenticates a user there are normally two routes. It could do a normal netr_LogonSamLogon or a samlogon with Kerberos. If you want to authenticate a user using samlogon you can do this using ‘wbinfo -a’, with Kerberos using ‘wbinfo -K’.

So if a login is initiated, the main Winbind process gets a PAM authentication request. The authentication request is normally sent to the DC Winbind has been joined to. So if LEVEL1+asn is trying to log in, the Winbind child handling LEVEL1 will do a LogonSamLogon using the Netlogon pipe to the domain controller. The domain controller is responsible for collecting all required information about the user and will return all information about group memberships in the info3 structure of the LogonSamLogon response.
If Kerberos is involved, the Winbind child handling LEVEL1 will authenticate the user by talking to the KDC of the domain controller. All information will be stored in the PAC (Privilege Attribute Certificate) of the Kerberos ticket (which is similar to the info3 structure in the LogonSamLogon response).

id and getent

Information retrieval for ‘id’ or ‘getent’ takes a special route if we are only able to gather it using the machine account. The results collected with the machine account can differ from the results obtained during user authentication! Normally the information sent back from the Domain Controller during authentication is much more detailed and complete. With the limited resources of a machine account we can only try to get the basic information from the domain controller of the domain that the user is a member of. We will not contact trusted domains, as enumeration is expensive and often not allowed with a machine account.

Let’s look at which functions will be called in Winbind if a user runs the command ‘id LEVEL1+asn’. For ‘id’ to work we assume that nsswitch has been correctly configured to talk to Winbind. There is a libnss_winbind.so module which talks to the parent Winbind process over a unix pipe. The parent Winbind process handles all nsswitch function calls (POSIX functions) coming over the pipe asynchronously. We do not discuss id mapping here, as it would get too complex; we will just look at the flow of information.

‘id LEVEL1+asn’ calls several POSIX functions which are sent over the UNIX pipe to the main Winbind process. These functions are getpwnam, getgrgid and getgroups. We assume that we have cold caches and need to handle these requests using machine account privileges.

getpwnam

The first thing ‘id’ calls is the getpwnam function. This will retrieve basic information about the user like the primary group id, the home directory and shell. The main Winbind process sends three queries to the LEVEL1 child for this: a lookupname to get the SID of the user, a second lookup to translate the SID back to the username for verification, and finally a QueryUser call to get the basic information (primary gid, …). The first lookup is a lsa_LookupNames call to the DC’s LSA server. The second is a LookupSids call to translate the SID to a name again. The QueryUser command is an LDAP query. Normally we always try to get the information with the fastest method and fall back to slower mechanisms if that fails.
After we have collected all the information and stored it in the cache (the child processes are responsible for the caches), we return the information to nsswitch. Now the ‘id’ command needs to get the name of the primary group and calls ‘getgrgid 1000000’ (1000000 being the gid of the primary group).

getgrgid 1000000

The parent Winbind will connect to the LEVEL1 domain child and call lookupname. LEVEL1 will then connect over RPC to the LSA pipe and call LookupSids3 to translate the SID to the name (the idmapping knows about the SID for the gid, the details are left out here).
As we have the important user information it is time to ask for additional group memberships of the user. This results in the following call:

getgroups LEVEL1+asn

The request is received by the main Winbind process, which needs to resolve the groups on three domains. These are always the same, even if the machine is joined to a different domain than the one the user is a member of.

a) The domain the user is a member of (LEVEL1)
b) The local SAM Authority (SAMBA)
c) The BUILTIN domain

We will need the SID of the user first, so we ask the LEVEL1 child to resolve the name to a SID. Then for each of the domains we ask the corresponding child to do a LookupUserGroups. The LEVEL1 child will do an LDAP search to get a list of SIDs of the groups the user is a member of. Then it will talk to the DC LSA Server and call LookupSids3() to translate the SID into a name for each group. The information is sent back to the parent, which will ask the local domain (SAMBA) if there are any aliases that the user is a member of. It will send a LookupUserAliases to the SAMBA child, which will look up the information using pdb. The final step is to talk to the BUILTIN domain for user aliases.

After all of the above POSIX calls were successful id will print the information it collected.
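
Seen from the client side, the three steps above boil down to a handful of libc calls. The following is only a rough sketch of what a tool like ‘id’ does, not the actual implementation of id(1). The function print_identity is a made-up name, and I am assuming glibc here, where getgrouplist() is the call that ends up as the group membership request described above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes getgrouplist() on glibc */
#endif
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Print a line similar to the output of id(1) for the given user name.
 * Returns 0 on success and -1 if the user is unknown or a lookup fails.
 */
int print_identity(const char *name)
{
    /* Step 1: getpwnam, basic account information */
    struct passwd *pw = getpwnam(name);
    if (pw == NULL) {
        return -1;
    }
    uid_t uid = pw->pw_uid;
    gid_t gid = pw->pw_gid;

    /* Step 2: getgrgid, name of the primary group */
    struct group *gr = getgrgid(gid);
    printf("uid=%lu(%s) gid=%lu(%s) groups=",
           (unsigned long)uid, name,
           (unsigned long)gid, gr != NULL ? gr->gr_name : "?");

    /* Step 3: all groups of the user; retry once if the buffer is too small */
    int ngroups = 32;
    gid_t *groups = malloc((size_t)ngroups * sizeof(*groups));
    if (groups == NULL) {
        return -1;
    }
    if (getgrouplist(name, gid, groups, &ngroups) == -1) {
        /* ngroups now holds the number of groups that were found */
        gid_t *tmp = realloc(groups, (size_t)ngroups * sizeof(*groups));
        if (tmp == NULL) {
            free(groups);
            return -1;
        }
        groups = tmp;
        if (getgrouplist(name, gid, groups, &ngroups) == -1) {
            free(groups);
            return -1;
        }
    }

    /* Translate every gid to a name, which is another getgrgid per group */
    for (int i = 0; i < ngroups; i++) {
        gr = getgrgid(groups[i]);
        printf("%s%lu(%s)", i > 0 ? "," : "",
               (unsigned long)groups[i], gr != NULL ? gr->gr_name : "?");
    }
    printf("\n");

    free(groups);
    return 0;
}
```

On a machine where nsswitch is configured for Winbind, calling print_identity("LEVEL1+asn") should trigger the requests described in the sections above.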

If you log in as the user using Kerberos first, then the information about the user is cached by the domain child serving the user’s domain. If you now call getpwnam, the query will be answered with the information stored in the cache. Only the SID to name translation requires a LookupSids3 LSA call to the DC if it is not cached yet. The same applies to a getgrgid or getgroups call. We already got the information about which groups the user is a member of from the DC in the PAC. We just need to translate the SIDs to names.

To be continued …

Documenting the Source

As you may know, I have had a new job since last December and I’m working on
Samba4 now. Samba4 is a monster, so I asked for some simple tasks to get
started. The task was to migrate some code to a new Samba library called
tsocket. The problem was I didn’t know what to do and how. Some functions
of the API were documented but not all. So I had to guess from the names
what a function does and read the code to understand it. Then I
started to work with the interface and I had to look at the code again to
find out the possible return values. In the end I spent a lot of time jumping
through the source code to find out the return values of the functions.

If the API had been completely documented, I could have got my work done a lot
faster, so I simply started to document it because I had to understand it anyway.
I decided to write the documentation with doxygen and put it in the header
file, so that people who use the API always have the documentation with them.

After I finished it, I started to work on the source code again and got some
things working, as I was now able to understand the API of the library. Then I
came across the next undocumented API of a library. Ok, it wasn’t undocumented, it
had a text file describing everything, but having doxygen documentation is much
nicer than a text file. So I started to document talloc from Samba4 with
doxygen.

The talloc API uses macros for a lot of things to make debugging easier or
to hide things you’re doing from the user. However, if you document a macro
then you normally want it to look like a function. To be able to do that
with doxygen you have to use a little trick. As doxygen has a C preprocessor
built in, you can create a define for a doxygen mode. That’s what I’ve done in
the config file, and all you need to do in the source code is to use it with
#ifdef.

#ifdef DOXYGEN
/**
* @brief Create a new talloc context.
*
* The talloc() macro is the core of the talloc library. It takes a memory
* context and a type, and returns a pointer to a new area of memory of the
* given type.
*
* The returned pointer is itself a talloc context, so you can use it as the
* context argument to more calls to talloc if you wish.
*
* The returned pointer is a "child" of the supplied context. This means that if
* you talloc_free() the context then the new child disappears as well.
* Alternatively you can free just the child.
*
* @param[in] ctx A talloc context to create a new reference on or NULL to
* create a new top level context.
*
* @param[in] type The type of memory to allocate.
*
* @return A type casted talloc context or NULL on error.
*
* @code
* unsigned int *a, *b;
*
* a = talloc(NULL, unsigned int);
* b = talloc(a, unsigned int);
* @endcode
*
* @see talloc_zero
* @see talloc_array
* @see talloc_steal
* @see talloc_free
*/
void *talloc(const void *ctx, #type);
#else
#define talloc(ctx, type) (type *)talloc_named_const(ctx, sizeof(type), #type)
void *_talloc(const void *context, size_t size);
#endif
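
To show that the trick is not specific to talloc, here is a small self-contained sketch of the same pattern for a made-up allocator. The names xnew, xnew_named and xnew_last_type are hypothetical and only exist in this example. It assumes that the doxygen config enables preprocessing and predefines DOXYGEN (the ENABLE_PREPROCESSING and PREDEFINED options), while the C compiler never sees that define.

```c
#include <stdlib.h>

/* The real function behind the macro. The type name is kept for debugging. */
void *xnew_named(size_t size, const char *name);

#ifdef DOXYGEN
/**
 * @brief Allocate a zeroed object of the given type.
 *
 * @param[in] type The type of the object to allocate.
 *
 * @return A pointer casted to the given type or NULL on error.
 */
void *xnew(#type);
#else
/* What the compiler sees: a macro passing the size and the type name */
#define xnew(type) ((type *)xnew_named(sizeof(type), #type))
#endif

/* Name of the type used in the most recent allocation */
static const char *xnew_last_name;

void *xnew_named(size_t size, const char *name)
{
    xnew_last_name = name;
    return calloc(1, size);
}

/* Return the type name recorded by the last xnew() call, or NULL */
const char *xnew_last_type(void)
{
    return xnew_last_name;
}
```

Doxygen only sees the prototype in the first branch and documents xnew() like a function, while the compiler only sees the macro in the second branch.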

So start to document your API. What you get will be something like this and others will love it!

Automatic testing of PAM modules

Last week at the SambaXP conference I had a discussion with Günther Deschner about the testing of PAM modules. What we want to do is automatic testing. To achieve this in the Samba build farm you need a separate “pam.d” config directory for testing. You should be able to change the config and mess it up without getting locked out.

I’ve introduced a new function to PAM called pam_start_test() which takes an additional argument where you can specify the config directory. After this I changed the call in pamtester and added a command line option for the config directory. To do automatic testing I’ve added another command line option to specify the password to use for authentication.

gladiac@maximegalon:~> pamtester -v -C/tmp/pam.d -Psecret login csync authenticate
pamtester: invoking pam_start(login, csync, ...)
pamtester: performing operation - authenticate
pamtester: successfully authenticated

You can find the patches here.

Roaming Home Directories for Linux

An interesting feature of Active Directory is Roaming Profiles. You can log in on different workstations and you have all your data with you. If you use a notebook you have the same, plus the ability to work offline. As soon as you’re connected to your network again, the data will be synchronized automatically and you have a backup of your data.

Now the time has come to introduce Roaming Home Directories for Linux. Yesterday I released a new version of csync and the first version of pam_csync. With both components you’re able to use an Active Directory environment to share your data between workstations and notebooks and work offline.

Currently only the SMB protocol is supported, but I will write more plugins for other protocols in the future. I have sftp and rsync (if doable) in mind. So you will be able to use it at home with your small NAS or in a Linux-only company environment.

This is not the only use case. If you have a USB disk with your music collection, you can attach it to PC1 and synchronize it with your local collection. Go to the next computer and synchronize it there again. As csync is a bidirectional file synchronizer, the collection on PC1 and PC2 will be the same.
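
To illustrate what bidirectional means here, this is a deliberately simplified sketch of the decision a two-way synchronizer has to make for a single file. It is not the algorithm csync uses. The function decide_sync and the idea of comparing modification times against the time of the last synchronization are assumptions made for this example only.

```c
#include <time.h>

enum sync_action {
    SYNC_NONE,        /* neither replica changed */
    SYNC_COPY_A_TO_B, /* only replica A changed */
    SYNC_COPY_B_TO_A, /* only replica B changed */
    SYNC_CONFLICT     /* both replicas changed since the last sync */
};

/*
 * Decide what to do with one file that exists on both replicas.
 *
 * mtime_a, mtime_b: current modification time of the file on each replica
 * last_sync:        time of the last successful synchronization
 */
enum sync_action decide_sync(time_t mtime_a, time_t mtime_b, time_t last_sync)
{
    int a_changed = mtime_a > last_sync;
    int b_changed = mtime_b > last_sync;

    if (a_changed && b_changed) {
        return SYNC_CONFLICT;
    }
    if (a_changed) {
        return SYNC_COPY_A_TO_B;
    }
    if (b_changed) {
        return SYNC_COPY_B_TO_A;
    }
    return SYNC_NONE;
}
```

A one-way tool only ever needs the second case. The point of a bidirectional synchronizer is that changes can flow in both directions and that the case where both sides changed has to be detected.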

http://www.csync.org/

csync 0.42.0 alpha1

I’ve released the first alpha version of csync. csync is a client only bidirectional file synchronizer. You can use csync for different things. The intention is to provide Roaming Home Directories for Linux but you can use it to synchronize your music collection or create a backup of a directory. This is *not* intended for production environments and is designed for testing purposes only. This version is fully functional and you can sync two local directories or a local directory with a samba share.

More at http://www.csync.org/