Deciphering Xcode’s index

At work we’re having to wait an inordinate amount of time for Xcode to finish indexing our rather large Swift project. I’ve consequently spent a lot of time over the past few weeks digging into the internals of indexing. This is more or less a brain dump of what I’ve discovered thus far.

File structure #

Xcode’s index is broken into a number of files, located in {project derived data}/Index/{build config}/{platform}/{project}.xcindex/:

db.xcindexdb
db.xcindexdb-shm
db.xcindexdb-wal
db.xcindexdb.strings-cmp
db.xcindexdb.strings-dir
db.xcindexdb.strings-file
db.xcindexdb.strings-moduleurl
db.xcindexdb.strings-res
db.xcindexdb.strings-sym

SQLite files #

Strings files #

These files are collections of strings separated by 0x00 (with a leading and trailing 0x00). These strings are referenced by rows in the SQLite database. Why they aren’t in the SQLite database is anyone’s guess.

The references in the database are integer offsets to the start of the string in these files (presumably for performance reasons).
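A minimal sketch in Python of resolving such an offset, assuming the layout described above and that the strings are UTF-8 (an assumption). The function name `read_string_at` and the sample bytes are made up for illustration and are not taken from a real index:

```python
def read_string_at(data: bytes, offset: int) -> str:
    """Return the 0x00-terminated string that starts at `offset`."""
    end = data.find(b"\x00", offset)
    if end == -1:  # no terminator found: read to the end of the buffer
        end = len(data)
    return data[offset:end].decode("utf-8", errors="replace")


# Synthetic buffer with a leading and trailing 0x00, as described above.
sample = b"\x00logger\x00Logger\x00"
print(read_string_at(sample, 1))  # logger
print(read_string_at(sample, 8))  # Logger
```

For a real index, `data` would be the full contents of one of the strings files, for example `open("db.xcindexdb.strings-sym", "rb").read()`.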

SQLite database schema #

[Figure: XcodeIndex.png]

Open up the db.xcindexdb file in the SQLite command line shell and get the list of tables:

$ sqlite3 db.xcindexdb
sqlite> .tables
context    group_     language   reference  unit     
file       kind       provider   symbol

Let’s dig into each of these. Keep in mind that the files, directories, and other strings referenced in the database are stored externally, in the aforementioned strings files. (Again, why they aren’t in the db is anyone’s guess.)

Note: Your output may differ from mine. To keep things simple, I’m using the index for a small Mac app project with only a couple of source files.

Also, the comments in the SQL table definitions are mine.

kind #

The kind table appears to hold the various possible token types.

sqlite> .schema kind
CREATE TABLE kind(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    identifier TEXT NOT NULL
);
sqlite> SELECT * FROM kind;
id|identifier
1|Xcode.SourceCodeSymbolKind.IBOutlet
2|Xcode.SourceCodeSymbolKind.GlobalVariable
3|Xcode.SourceCodeSymbolKind.Global
4|Xcode.SourceCodeSymbolKind.ToDo
5|Xcode.SourceCodeSymbolKind.Callable
6|Xcode.SourceCodeSymbolKind.StaticProperty
7|Xcode.SourceCodeSymbolKind.BuiltinType
8|Xcode.SourceCodeSymbolKind.FunctionTemplate
9|Xcode.SourceCodeSymbolKind.StaticMethod
10|Xcode.SourceCodeSymbolKind.Member
...

provider #

The provider table appears to hold the various sources of index data (e.g. Clang, SourceKit, etc).

sqlite> .schema provider
CREATE TABLE provider (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    identifier TEXT NOT NULL,
    version TEXT NOT NULL
);
sqlite> SELECT * FROM provider;
id|identifier|version
1|Xcode.IDEFoundation.Index.DataSource.Unknown|1.0
2|Xcode.Swift.Index.DataSource|3
3|Xcode.IDEFoundation.Index.DataSource.auxiliaryFiles|1.1
4|Xcode.IDEFoundation.Index.DataSource.clang-module|1
5|Xcode.IDEFoundation.Index.DataSource.clang|1

language #

language is the list of source code languages that Xcode knows about.

sqlite> .schema language
CREATE TABLE language (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    identifier TEXT NOT NULL
);
sqlite> SELECT * FROM language;
id|identifier
1|Xcode.SourceCodeLanguage.C
2|Xcode.SourceCodeLanguage.CSS
3|Xcode.SourceCodeLanguage.JSON
4|Xcode.SourceCodeLanguage.Metal
5|Xcode.SourceCodeLanguage.BourneShellScript
6|Xcode.SourceCodeLanguage.XML
7|Xcode.SourceCodeLanguage.OpenCL
8|Xcode.SourceCodeLanguage.XcodeStrings
9|Xcode.SourceCodeLanguage.C-Plus-Plus
10|Xcode.SourceCodeLanguage.Objective-C
...

file #

file, unsurprisingly, represents a file. Again, remember that the filenames and directories are offsets into the strings files.

sqlite> .schema file
CREATE TABLE file (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    -- offset in db.xcindexdb.strings-file
    lowercaseFilename INTEGER NOT NULL,
    -- offset in db.xcindexdb.strings-file
    filename INTEGER NOT NULL,
    -- offset in db.xcindexdb.strings-dir
    directory INTEGER NOT NULL,
    -- offset in db.xcindexdb.strings-moduleurl
    --    0 = project's module?
    moduleurl INTEGER NOT NULL,
    inProject INTEGER NOT NULL
);
CREATE INDEX file_lowercaseFilename_index ON file (lowercaseFilename);

group_ #

sqlite> .schema group_
CREATE TABLE group_ (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file INTEGER NOT NULL,     -- foreign key to `file` table
    signature TEXT NOT NULL,
    signature_inBody TEXT NOT NULL,
    provider INTEGER NOT NULL  -- foreign key to `provider` table
);
CREATE INDEX group_index ON group_ (file, signature);

unit #

unit appears to represent a compilation unit.

sqlite> .schema unit
CREATE TABLE unit (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    file INTEGER NOT NULL,  -- foreign key to `file` table
    target TEXT NOT NULL,   -- path to Xcode target
    provider INTEGER NOT NULL,  -- foreign key to `provider` table
    pchFile INTEGER
);
CREATE UNIQUE INDEX unit_index ON unit (file, target);
CREATE INDEX unit_target_index ON unit (target);
CREATE INDEX unit_provider_index ON unit (provider);

The target column has (at least) three possible values:

context #

sqlite> .schema context
CREATE TABLE context (
    unit INTEGER NOT NULL,    -- foreign key to `unit` table
    group_ INTEGER NOT NULL,  -- foreign key to `group_` table
    includer INTEGER,
    -- modified time relative to 00:00:00 UTC on 1 January 2001
    --     (NSDate's reference date)
    modified REAL,
    spliced INTEGER DEFAULT 0
);
CREATE UNIQUE INDEX context_index ON context (unit, group_);
CREATE INDEX context_group_index ON context (group_);
CREATE INDEX context_includer_index ON context (includer);
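The modified column can be converted to a calendar date by adding it to the reference date. This is a small sketch that assumes the value is in seconds, as NSDate time intervals are; `modified_to_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# NSDate's reference date: 00:00:00 UTC on 1 January 2001.
REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)


def modified_to_datetime(seconds: float) -> datetime:
    """Convert a `context.modified` value to a timezone-aware UTC datetime."""
    return REFERENCE_DATE + timedelta(seconds=seconds)


print(modified_to_datetime(0).isoformat())      # 2001-01-01T00:00:00+00:00
print(modified_to_datetime(86400).isoformat())  # 2001-01-02T00:00:00+00:00
```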

symbol #

symbol contains the symbols in your code.

sqlite> .schema symbol
CREATE TABLE symbol (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    -- offset in db.xcindexdb.strings-sym
    spelling INTEGER NOT NULL,
    -- offset in db.xcindexdb.strings-sym
    lowercaseSpelling INTEGER NOT NULL,
    kind INTEGER,  -- foreign key to `kind` table
    role INTEGER NOT NULL,
    language INTEGER,  -- foreign key to `language` table
    resolution INTEGER,
    group_ INTEGER NOT NULL,  -- foreign key to `group_` table
    lineNumber INTEGER,  -- starts at 1
    column INTEGER,      -- starts at 1
    locator TEXT,
    container INTEGER,
    completionString INTEGER
);
CREATE INDEX symbol_lowercaseSpelling_index ON symbol (lowercaseSpelling);
CREATE INDEX symbol_resolution_index ON symbol (resolution);
CREATE INDEX symbol_kind_index ON symbol (kind);
CREATE INDEX symbol_group_index ON symbol (group_);
CREATE INDEX symbol_container_index ON symbol (container);
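One way to get a feel for what the index contains is to count symbols per kind. The sketch below assumes symbol.kind really does reference kind.id, as the schema comments suggest. It runs against an in-memory database with trimmed-down tables and made-up rows, not against a real index.

```python
import sqlite3

QUERY = """
SELECT k.identifier, COUNT(*) AS n
FROM symbol s
JOIN kind k ON k.id = s.kind
GROUP BY k.identifier
ORDER BY n DESC, k.identifier
"""


def symbols_per_kind(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (kind identifier, symbol count) pairs, most common first."""
    return conn.execute(QUERY).fetchall()


# In-memory stand-in holding only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kind (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    identifier TEXT NOT NULL
);
CREATE TABLE symbol (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    spelling INTEGER NOT NULL,
    kind INTEGER
);
INSERT INTO kind (identifier) VALUES
    ('Xcode.SourceCodeSymbolKind.IBOutlet'),
    ('Xcode.SourceCodeSymbolKind.GlobalVariable');
INSERT INTO symbol (spelling, kind) VALUES (8, 2), (15, 2), (22, 1);
""")

for identifier, count in symbols_per_kind(conn):
    print(identifier, count)
# Xcode.SourceCodeSymbolKind.GlobalVariable 2
# Xcode.SourceCodeSymbolKind.IBOutlet 1
```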

reference #

reference contains the references in your code to symbols.

sqlite> .schema reference
CREATE TABLE reference (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    -- offset in db.xcindexdb.strings-sym
    spelling INTEGER NOT NULL,
    -- offset in db.xcindexdb.strings-sym
    lowercaseSpelling INTEGER NOT NULL,
    kind INTEGER,  -- foreign key to `kind` table
    role INTEGER NOT NULL,
    language INTEGER,  -- foreign key to `language` table
    resolution INTEGER,
    group_ INTEGER NOT NULL,  -- foreign key to `group_` table
    lineNumber INTEGER,  -- starts at 1
    column INTEGER,      -- starts at 1
    locator TEXT,
    container INTEGER,
    receiver INTEGER
);
CREATE INDEX reference_lowercaseSpelling_index ON reference (lowercaseSpelling);
CREATE INDEX reference_resolution_index ON reference (resolution);
CREATE INDEX reference_group_index ON reference (group_);
CREATE INDEX reference_container_index ON reference (container);

Examples #

OK, now that we’ve got the basic structure down, let’s do some lookups. We’re going to be working from a basic macOS project with the following code:

// Logger.swift
final class Logger {
    func log(message: Message) {
        print(message)
    }

    func dump() -> String {
        return "Hello, world!"
    }
}

// Message.swift
struct Message: CustomStringConvertible {
    let content: String
    let file: String
    let line: Int

    var description: String {
        return "\(file):\(line) -- \(content)"
    }
}

// main.swift
let logger = Logger()
logger.log(message: Message(content: "Hello, world!", file: #file, line: #line))

Challenge #

Find all usages of the Logger class in our code.

Solution #

Step 1 #

Find the location of the string “Logger” in the .strings-sym file.

$ strings -a -t d db.xcindexdb.strings-sym | grep "^[0-9]* Logger$"
8 Logger
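The same lookup can be sketched in Python, which also sidesteps the fact that `strings` skips anything shorter than four characters by default. This assumes every entry is wrapped in 0x00 bytes, as described earlier. `find_string_offset` is a hypothetical helper, and the sample bytes are synthetic, chosen only to line up with the offset shown above.

```python
def find_string_offset(data: bytes, text: str) -> int:
    """Return the offset of the entry exactly equal to `text`, or -1."""
    # Wrapping the text in separators rules out partial matches.
    needle = b"\x00" + text.encode("utf-8") + b"\x00"
    pos = data.find(needle)
    # +1 skips the separator that precedes the match.
    return pos + 1 if pos != -1 else -1


sample = b"\x00logger\x00Logger\x00"
print(find_string_offset(sample, "Logger"))  # 8
print(find_string_offset(sample, "Log"))     # -1
```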

Step 2 #

Find its record in the reference table.

$ sqlite3 db.xcindexdb
sqlite> SELECT * FROM reference
   ...> WHERE spelling = 8;
id|spelling|lowercaseSpelling|kind|role|language|resolution|group_|lineNumber|column|locator|container|receiver
11719|8|1|5|4|35|93|1355|9|14|||

This gives us all the places Logger is referenced (only one, currently).

Step 3 #

To find the files corresponding to these references, join across group_ to file:

sqlite> SELECT f.id,f.filename,f.directory
   ...> FROM file f
   ...> INNER JOIN group_ g ON (g.file = f.id)
   ...> INNER JOIN reference r ON (r.group_ = g.id)
   ...> WHERE r.spelling = 8;
id|filename|directory
1|1|1

We can now jump to offset 1 in db.xcindexdb.strings-file to get the filename where Logger is referenced (main.swift) and jump to offset 1 in db.xcindexdb.strings-dir to get the directory path.
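The join and the two offset lookups can be combined in one script. The following is a sketch under stated assumptions: the tables are trimmed to the columns the query needs, the rows and the `/tmp/Demo` directory are invented, and `files_referencing` is a hypothetical helper rather than anything Xcode provides.

```python
import posixpath
import sqlite3


def read_string_at(data: bytes, offset: int) -> str:
    """Return the 0x00-terminated string that starts at `offset`."""
    end = data.find(b"\x00", offset)
    if end == -1:
        end = len(data)
    return data[offset:end].decode("utf-8", errors="replace")


def files_referencing(conn, spelling, file_strings, dir_strings):
    """Return full paths of files that reference the given spelling offset."""
    rows = conn.execute(
        """
        SELECT DISTINCT f.filename, f.directory
        FROM file f
        INNER JOIN group_ g ON g.file = f.id
        INNER JOIN reference r ON r.group_ = g.id
        WHERE r.spelling = ?
        """,
        (spelling,),
    ).fetchall()
    return [
        posixpath.join(
            read_string_at(dir_strings, directory),
            read_string_at(file_strings, filename),
        )
        for filename, directory in rows
    ]


# Synthetic stand-ins for the database and the two strings files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (id INTEGER PRIMARY KEY, filename INTEGER, directory INTEGER);
CREATE TABLE group_ (id INTEGER PRIMARY KEY, file INTEGER);
CREATE TABLE reference (id INTEGER PRIMARY KEY, spelling INTEGER, group_ INTEGER);
INSERT INTO file VALUES (1, 1, 1);
INSERT INTO group_ VALUES (1355, 1);
INSERT INTO reference VALUES (11719, 8, 1355);
""")
file_strings = b"\x00main.swift\x00"
dir_strings = b"\x00/tmp/Demo\x00"  # hypothetical project directory

paths = files_referencing(conn, 8, file_strings, dir_strings)
print(paths)  # ['/tmp/Demo/main.swift']
```

`DISTINCT` keeps a file from being listed once per reference when it mentions the symbol several times.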

To find where Logger is defined instead, replace the reference table with symbol in steps 2 and 3.

Next steps #

I ultimately want to be able to modify the index outside of Xcode (e.g. move the project to a different location on disk or share it across multiple machines without triggering a reindex). Stay tuned!

 