Deciphering Xcode’s index
At work we’re having to wait an inordinate amount of time for Xcode to finish indexing our rather large Swift project. I’ve consequently spent a lot of time over the past few weeks digging into the internals of indexing. This is more or less a brain dump of what I’ve discovered thus far.
File structure
Xcode’s index is broken into a number of files, located in {project derived data}/Index/{build config}/{platform}/{project}.xcindex/:
db.xcindexdb
db.xcindexdb-shm
db.xcindexdb-wal
db.xcindexdb.strings-cmp
db.xcindexdb.strings-dir
db.xcindexdb.strings-file
db.xcindexdb.strings-moduleurl
db.xcindexdb.strings-res
db.xcindexdb.strings-sym
SQLite files
db.xcindexdb: The index’s SQLite database, which we’ll cover below
db.xcindexdb-shm: The SQLite shared-memory file
db.xcindexdb-wal: The SQLite write-ahead log
Strings files
These files are collections of strings separated by 0x00 (with a leading and trailing 0x00). These strings are referenced by rows in the SQLite database. Why they aren’t in the SQLite database is anyone’s guess.
The references in the database are integer offsets to the start of the string in these files (presumably for performance reasons).
db.xcindexdb.strings-cmp: ??
db.xcindexdb.strings-dir: paths to directories referenced in/used by the project (e.g. the project bundle, $SRCROOT, the project’s derived data directory, system frameworks, etc)
db.xcindexdb.strings-file: both the regular and lowercase names of files used in the project (e.g. AppDelegate.swift and appdelegate.swift)
db.xcindexdb.strings-moduleurl: URLs to Xcode modules, including compiler arguments (e.g. x-xcode-module://Swift?compilerArgs=-sdk%20/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk%20-target%20x86_64-apple-macosx10.12&language=Xcode.SourceCodeLanguage.Swift)
db.xcindexdb.strings-res: Unified Symbol Resolution (USR) strings, each of which identifies a specific entity (e.g. class, function, etc). Note that these names are mangled.
db.xcindexdb.strings-sym: symbol names
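The offset-based lookup described above can be sketched in a few lines of Python. This is only a sketch under the layout I've observed (0x00-separated strings, offsets counted in bytes from the start of the file). The helper name read_string_at is made up for illustration, and UTF-8 is an assumption about the encoding.

```python
from pathlib import Path


def read_string_at(strings_path, offset):
    """Return the 0x00-terminated string that starts at byte `offset`."""
    data = Path(strings_path).read_bytes()
    if not 0 <= offset < len(data):
        raise ValueError(f"offset {offset} is outside the file")
    end = data.find(b"\x00", offset)
    if end == -1:  # no trailing separator: read to the end of the file
        end = len(data)
    # UTF-8 is an assumption; undecodable bytes are replaced rather than fatal.
    return data[offset:end].decode("utf-8", errors="replace")
```

For example, read_string_at("db.xcindexdb.strings-file", n) would return the filename that a file row with filename = n points at.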
SQLite database schema
Open up the db.xcindexdb file in the SQLite command line shell and get the list of tables:
$ sqlite3 db.xcindexdb
sqlite> .tables
context group_ language reference unit
file kind provider symbol
Let’s dig into each of these. Keep in mind that the files, directories, and other strings referenced in the database are stored externally, in the aforementioned strings files. (Again, why they aren’t in the db is anyone’s guess.)
Note: Your output may differ from mine. I’m using the index files for a simple Mac app project with a couple of source files, to keep it simple.
Also, the comments in the SQL table definitions are mine.
kind
The kind table appears to hold the various possible token types.
sqlite> .schema kind
CREATE TABLE kind(
id INTEGER PRIMARY KEY AUTOINCREMENT,
identifier TEXT NOT NULL
);
sqlite> SELECT * FROM kind;
id|identifier
1|Xcode.SourceCodeSymbolKind.IBOutlet
2|Xcode.SourceCodeSymbolKind.GlobalVariable
3|Xcode.SourceCodeSymbolKind.Global
4|Xcode.SourceCodeSymbolKind.ToDo
5|Xcode.SourceCodeSymbolKind.Callable
6|Xcode.SourceCodeSymbolKind.StaticProperty
7|Xcode.SourceCodeSymbolKind.BuiltinType
8|Xcode.SourceCodeSymbolKind.FunctionTemplate
9|Xcode.SourceCodeSymbolKind.StaticMethod
10|Xcode.SourceCodeSymbolKind.Member
...
provider
The provider table appears to hold the various sources of index data (e.g. Clang, SourceKit, etc).
sqlite> .schema provider
CREATE TABLE provider (
id INTEGER PRIMARY KEY AUTOINCREMENT,
identifier TEXT NOT NULL,
version TEXT NOT NULL
);
sqlite> SELECT * FROM provider;
id|identifier|version
1|Xcode.IDEFoundation.Index.DataSource.Unknown|1.0
2|Xcode.Swift.Index.DataSource|3
3|Xcode.IDEFoundation.Index.DataSource.auxiliaryFiles|1.1
4|Xcode.IDEFoundation.Index.DataSource.clang-module|1
5|Xcode.IDEFoundation.Index.DataSource.clang|1
language
language is the list of programming languages about which Xcode knows.
sqlite> .schema language
CREATE TABLE language (
id INTEGER PRIMARY KEY AUTOINCREMENT,
identifier TEXT NOT NULL
);
sqlite> SELECT * FROM language;
id|identifier
1|Xcode.SourceCodeLanguage.C
2|Xcode.SourceCodeLanguage.CSS
3|Xcode.SourceCodeLanguage.JSON
4|Xcode.SourceCodeLanguage.Metal
5|Xcode.SourceCodeLanguage.BourneShellScript
6|Xcode.SourceCodeLanguage.XML
7|Xcode.SourceCodeLanguage.OpenCL
8|Xcode.SourceCodeLanguage.XcodeStrings
9|Xcode.SourceCodeLanguage.C-Plus-Plus
10|Xcode.SourceCodeLanguage.Objective-C
...
file
file, unsurprisingly, represents a file. Again, remember that the filenames and directories are offsets into the strings files.
sqlite> .schema file
CREATE TABLE file (
id INTEGER PRIMARY KEY AUTOINCREMENT,
-- offset in db.xcindexdb.strings-file
lowercaseFilename INTEGER NOT NULL,
-- offset in db.xcindexdb.strings-file
filename INTEGER NOT NULL,
-- offset in db.xcindexdb.strings-dir
directory INTEGER NOT NULL,
-- offset in db.xcindexdb.strings-moduleurl
-- 0 = project's module?
moduleurl INTEGER NOT NULL,
inProject INTEGER NOT NULL
);
CREATE INDEX file_lowercaseFilename_index ON file (lowercaseFilename);
group_
sqlite> .schema group_
CREATE TABLE group_ (
id INTEGER PRIMARY KEY AUTOINCREMENT,
file INTEGER NOT NULL, -- foreign key to `file` table
signature TEXT NOT NULL,
signature_inBody TEXT NOT NULL,
provider INTEGER NOT NULL -- foreign key to `provider` table
);
CREATE INDEX group_index ON group_ (file, signature);
unit
unit represents a compilation unit??
sqlite> .schema unit
CREATE TABLE unit (
id INTEGER PRIMARY KEY AUTOINCREMENT,
file INTEGER NOT NULL, -- foreign key to `file` table
target TEXT NOT NULL, -- path to Xcode target
provider INTEGER NOT NULL, -- foreign key to `provider` table
pchFile INTEGER
);
CREATE UNIQUE INDEX unit_index ON unit (file, target);
CREATE INDEX unit_target_index ON unit (target);
CREATE INDEX unit_provider_index ON unit (provider);
The target column has (at least) three possible values:
{path}: Full path to an Xcode target. It should follow this format: {path to directory containing Xcode project}/{target name}-{Xcode target UUID} (e.g. /Users/tyler/Sample/Sample-0BC905E01DF74955009CC0E6)
<aux>
<mod>
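To make the three cases concrete, here is a rough Python sketch that classifies a target value. The function name parse_target and the returned dictionary shape are hypothetical, and the pattern assumes the trailing identifier is always 24 uppercase hex digits, as in the example above. I haven't confirmed that this holds for every project.

```python
import re

# Assumption: {directory}/{target name}-{24 hex digit identifier}.
TARGET_RE = re.compile(r"^(?P<dir>.*)/(?P<name>.+)-(?P<uuid>[0-9A-F]{24})$")


def parse_target(target):
    """Classify a `unit.target` value and split a path-style value into parts."""
    if target in ("<aux>", "<mod>"):
        return {"kind": target}
    match = TARGET_RE.match(target)
    if match is None:
        return {"kind": "unknown", "raw": target}
    return {
        "kind": "target",
        "directory": match["dir"],
        "name": match["name"],
        "uuid": match["uuid"],
    }
```

Because the directory part is matched greedily, a target name that itself contains hyphens is still split at the last hyphen before the identifier.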
context
sqlite> .schema context
CREATE TABLE context (
unit INTEGER NOT NULL, -- foreign key to `unit` table
group_ INTEGER NOT NULL, -- foreign key to `group_` table
includer INTEGER,
-- modified time relative to 00:00:00 UTC on 1 January 2001
-- (NSDate's reference date)
modified REAL,
spliced INTEGER DEFAULT 0
);
CREATE UNIQUE INDEX context_index ON context (unit, group_);
CREATE INDEX context_group_index ON context (group_);
CREATE INDEX context_includer_index ON context (includer);
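Since modified is relative to NSDate's reference date rather than the Unix epoch, it needs converting before it can be compared with ordinary timestamps. A minimal Python sketch, assuming the value is in seconds (as NSDate time intervals are), with a made-up function name:

```python
from datetime import datetime, timedelta, timezone

# NSDate's reference date: 00:00:00 UTC on 1 January 2001.
REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)


def modified_to_datetime(modified):
    """Convert a `context.modified` value (assumed seconds) to a UTC datetime."""
    return REFERENCE_DATE + timedelta(seconds=modified)
```

Calling .timestamp() on the result gives the Unix time, which sits 978307200 seconds after the raw value.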
symbol
symbol contains the symbols in your code.
sqlite> .schema symbol
CREATE TABLE symbol (
id INTEGER PRIMARY KEY AUTOINCREMENT,
-- offset in db.xcindexdb.strings-sym
spelling INTEGER NOT NULL,
-- offset in db.xcindexdb.strings-sym
lowercaseSpelling INTEGER NOT NULL,
kind INTEGER, -- foreign key to `kind` table
role INTEGER NOT NULL,
language INTEGER, -- foreign key to `language` table
resolution INTEGER,
group_ INTEGER NOT NULL, -- foreign key to `group_` table
lineNumber INTEGER, -- starts at 1
column INTEGER, -- starts at 1
locator TEXT,
container INTEGER,
completionString INTEGER
);
CREATE INDEX symbol_lowercaseSpelling_index ON symbol (lowercaseSpelling);
CREATE INDEX symbol_resolution_index ON symbol (resolution);
CREATE INDEX symbol_kind_index ON symbol (kind);
CREATE INDEX symbol_group_index ON symbol (group_);
CREATE INDEX symbol_container_index ON symbol (container);
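The kind and language columns are just ids, so a join makes the rows easier to read. Here is a sketch using Python's sqlite3 module. describe_symbols is a hypothetical helper, and spelling_offset is the offset of the name in db.xcindexdb.strings-sym. I'd run this against a copy of the index rather than the live one.

```python
import sqlite3


def describe_symbols(db_path, spelling_offset):
    """Return (kind, language, line, column) for each symbol with this spelling."""
    db = sqlite3.connect(str(db_path))
    try:
        return db.execute(
            """
            SELECT k.identifier, l.identifier, s.lineNumber, s."column"
            FROM symbol s
            LEFT JOIN kind k ON k.id = s.kind
            LEFT JOIN language l ON l.id = s.language
            WHERE s.spelling = ?
            ORDER BY s.id
            """,
            (spelling_offset,),
        ).fetchall()
    finally:
        db.close()
```

LEFT JOIN is used because kind and language are nullable in the schema, so rows without either still show up.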
reference
reference contains the references in your code to symbols.
sqlite> .schema reference
CREATE TABLE reference (
id INTEGER PRIMARY KEY AUTOINCREMENT,
-- offset in db.xcindexdb.strings-sym
spelling INTEGER NOT NULL,
-- offset in db.xcindexdb.strings-sym
lowercaseSpelling INTEGER NOT NULL,
kind INTEGER, -- foreign key to `kind` table
role INTEGER NOT NULL,
language INTEGER, -- foreign key to `language` table
resolution INTEGER,
group_ INTEGER NOT NULL, -- foreign key to `group_` table
lineNumber INTEGER, -- starts at 1
column INTEGER, -- starts at 1
locator TEXT,
container INTEGER,
receiver INTEGER
);
CREATE INDEX reference_lowercaseSpelling_index ON reference (lowercaseSpelling);
CREATE INDEX reference_resolution_index ON reference (resolution);
CREATE INDEX reference_group_index ON reference (group_);
CREATE INDEX reference_container_index ON reference (container);
Examples
OK, now that we’ve got the basic structure down, let’s do some lookups. We’re going to be working from a basic macOS project with the following code:
// Logger.swift
final class Logger {
    func log(message: Message) {
        print(message)
    }

    func dump() -> String {
        return "Hello, world!"
    }
}

// Message.swift
struct Message: CustomStringConvertible {
    let content: String
    let file: String
    let line: Int

    var description: String {
        return "\(file):\(line) -- \(content)"
    }
}

// main.swift
let logger = Logger()
logger.log(message: Message(content: "Hello, world!", file: #file, line: #line))
Challenge
Find all usages of the Logger class in our code.
Solution
Step 1
Find the location of the string “Logger” in the .strings-sym file.
$ strings -a -t d db.xcindexdb.strings-sym | grep "^[0-9]* Logger$"
8 Logger
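The same lookup can be done without the strings tool, which by default skips strings shorter than four characters. The Python sketch below leans on the leading and trailing 0x00 described earlier to match whole entries only. find_string_offsets is a name I made up, and UTF-8 is again an assumption.

```python
from pathlib import Path


def find_string_offsets(strings_path, needle):
    """Return the byte offsets at which `needle` appears as a whole entry."""
    data = Path(strings_path).read_bytes()
    # Every entry is bracketed by 0x00, so search for the bracketed form.
    target = b"\x00" + needle.encode("utf-8") + b"\x00"
    offsets = []
    start = 0
    while True:
        hit = data.find(target, start)
        if hit == -1:
            return offsets
        offsets.append(hit + 1)  # skip the leading separator
        start = hit + 1
```

On the sample project this should report the same offset as the strings/grep pipeline above.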
Step 2
Find its record in the reference table.
$ sqlite3 db.xcindexdb
sqlite> SELECT * FROM reference
...> WHERE spelling = 8;
id|spelling|lowercaseSpelling|kind|role|language|resolution|group_|lineNumber|column|locator|container|receiver
11719|8|1|5|4|35|93|1355|9|14|||
This gives us all the places Logger is referenced (only one, currently).
Step 3
To find the files corresponding to these references, join across group_ to file:
sqlite> SELECT f.id,f.filename,f.directory
...> FROM file f
...> INNER JOIN group_ g ON (g.file = f.id)
...> INNER JOIN reference r ON (r.group_ = g.id)
...> WHERE r.spelling = 8;
id|filename|directory
1|1|1
We can now jump to offset 1 in db.xcindexdb.strings-file to get the filename where Logger is referenced (main.swift) and jump to offset 1 in db.xcindexdb.strings-dir to get the directory path.
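Steps 2 and 3, plus the final string lookups, can be rolled into one script. This is a sketch rather than a polished tool: the function names are hypothetical, UTF-8 is assumed for the strings files, and I'd point it at a copy of the .xcindex directory rather than one Xcode is actively writing to.

```python
import sqlite3
from pathlib import Path


def read_string_at(strings_path, offset):
    """Return the 0x00-terminated string that starts at byte `offset`."""
    data = Path(strings_path).read_bytes()
    end = data.find(b"\x00", offset)
    if end == -1:
        end = len(data)
    return data[offset:end].decode("utf-8", errors="replace")


def files_referencing(index_dir, spelling_offset):
    """Return (directory, filename) pairs for files that reference a spelling."""
    index_dir = Path(index_dir)
    db = sqlite3.connect(str(index_dir / "db.xcindexdb"))
    try:
        # Same join as step 3, with the offset passed as a parameter.
        rows = db.execute(
            """
            SELECT DISTINCT f.directory, f.filename
            FROM file f
            INNER JOIN group_ g ON g.file = f.id
            INNER JOIN reference r ON r.group_ = g.id
            WHERE r.spelling = ?
            """,
            (spelling_offset,),
        ).fetchall()
    finally:
        db.close()
    dir_strings = index_dir / "db.xcindexdb.strings-dir"
    file_strings = index_dir / "db.xcindexdb.strings-file"
    return [
        (read_string_at(dir_strings, d), read_string_at(file_strings, f))
        for d, f in rows
    ]
```

With the offset from step 1, files_referencing(index_dir, 8) would return the directory and filename of each file that references Logger.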
If you want to find where Logger is defined instead, replace the reference table with symbol in steps 2 and 3.
Next steps
I ultimately want to be able to modify the index outside of Xcode (e.g. move the project to a different location on disk or share it across multiple machines without triggering a reindex). Stay tuned!