Scoped tags support for Javascript, in Vim

Tags files

How do you navigate a Javascript project in Vim? Well, you generate a tags file, and use Vim’s extensive support for tags-based commands. A good Javascript-aware tags file generator is DoctorJS’s jsctags (itself written in Javascript).

Problem solved? Not quite. Javascript programming patterns tend to use nested declarations (module pattern, immediately applied function definitions for local namespaces, or just plain old local variable declarations and function parameters). Tags files were designed for languages with mostly global definitions, so if these local definitions show up at all in a tags file, there will be no scope-awareness (no navigation to the definition currently in scope).

Ok, so we need to see about extending the tags file information. The tags file format used in Vim is extensible, so we can add fields. Such added fields are already in use, for instance, to indicate the class or namespace of properties in object-oriented languages. DoctorJS, in earlier versions, used to generate such namespace information for Javascript tags (that feature seems to be broken in recent versions).

Scoped tags files

The difference here is that we want to support lexical scoping, not namespacing. Scoping differs quite widely between programming languages, even between language versions (Javascript’s function scoping is going to be accompanied by other scoping constructs in ES6). To avoid hard-coding language-specific scoping conventions into Vim, we use a language-independent form of indicating variable scope in a tags file, via a tag field

scope:startline:startcolumn-endline:endcolumn

It is then up to Vim scripts to find the matching tag for an identifier under the cursor: the tag with the smallest scope range wrapping the cursor position (smallest, to account for shadowing of bindings). The work of translating a language’s scoping rules into scope ranges is left to language-specific tags file generators.
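To make that lookup concrete, here is a minimal sketch in Javascript. It is not the code in scoped_tags.vim (which is Vim script); the function names, the shape of the tag objects and the choice of inclusive range bounds are all assumptions made for illustration. The sample ranges are taken from the jsctags output shown later in this post.

```javascript
// Parse "startline:startcolumn-endline:endcolumn" into numbers.
function parseScope(text) {
  const m = /^(\d+):(\d+)-(\d+):(\d+)$/.exec(text);
  if (!m) return null;
  const [startLine, startCol, endLine, endCol] = m.slice(1).map(Number);
  return { startLine, startCol, endLine, endCol };
}

// Compare two (line, column) positions: negative, zero or positive.
function comparePos(lineA, colA, lineB, colB) {
  return lineA !== lineB ? lineA - lineB : colA - colB;
}

// True if the scope range wraps the position (bounds inclusive: an assumption).
function contains(scope, line, col) {
  return comparePos(scope.startLine, scope.startCol, line, col) <= 0 &&
         comparePos(line, col, scope.endLine, scope.endCol) <= 0;
}

// True if scope a lies inside scope b.
function isWithin(a, b) {
  return comparePos(b.startLine, b.startCol, a.startLine, a.startCol) <= 0 &&
         comparePos(a.endLine, a.endCol, b.endLine, b.endCol) <= 0;
}

// Among the candidate tags for one identifier, pick the one whose scope
// is the innermost range wrapping the cursor; null if none wraps it.
function innermostTag(tags, line, col) {
  let best = null;
  for (const tag of tags) {
    const scope = parseScope(tag.scope);
    if (!scope || !contains(scope, line, col)) continue;
    if (!best || isWithin(scope, best.scope)) best = { tag, scope };
  }
  return best ? best.tag : null;
}

// Hypothetical tag records for three declarations of z.
const zTags = [
  { name: "z", lineno: 10, scope: "7:31-12:18" },
  { name: "z", lineno: 15, scope: "3:15-17:2" },
  { name: "z", lineno: 2, scope: "1:1-19:1" },
];

console.log(innermostTag(zTags, 8, 25).lineno); // 10: the innermost z shadows the others
console.log(innermostTag(zTags, 6, 21).lineno); // 15
```

Because lexical scopes wrapping the same position are nested inside each other, comparing ranges by containment is enough to find the smallest one.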

Generating scope-aware tags for Javascript requires a Javascript parser, augmented with a scope analysis (that also takes care of such nasty things as declaration hoisting). Fortunately, DoctorJS already fits the bill, and since it is implemented in Javascript it tends to be useable where Javascript code is under development (I use it via node, on Windows; the patches to make that work are in the DoctorJS ticket tracker). So, I’ve patched DoctorJS to output scoped tags, including tags for local variables, and I’ve written a Vim script to utilise such scope-aware tags files for navigation support:

Please consider these doctorjs/narcissus forks as temporary branches. I’ve put them up so that you can play with scope-aware tags for Javascript right now, so they include both patches to fix DoctorJS issues and the scoped-tags functionality. Depending on how and which of these patches get applied to the original repos, the commit history of my branches might change to follow suit.

Generating and using scoped tags

Enough general explanations, let’s try it out. First, download and installation (assuming you’ve got nodejs and Vim set up, as well as git and a shell):

$ mkdir scoped_tags_test
$ cd scoped_tags_test/
$ TESTINSTALL=`pwd`
$ git clone git://github.com/clausreinke/scoped_tags.git
$ git clone git://github.com/clausreinke/doctorjs.git
$ cd doctorjs/
$ git submodule update --init --recursive
$ make install PREFIX=$TESTINSTALL/jsctags
$ export NODE_PATH=$TESTINSTALL/jsctags/lib/jsctags/
$ cd ..

For testing, I’ve got a little file with nested definitions (it doesn’t do anything useful, but it has nested scopes and hoisted declarations),

function log(x) { }
var z;
function A() {
  var x;
  var y = [x,function(x) {
                log(z.a);
                function A() {
                  log(z.b);
                  var x;
                  var z;
                  return x;
                }
                return A(x);
              }];
  var z = [function(x) { return (function(x) { return x; }(function(x) { return x; }))(x); },x];
  return x;
}
A();

which we’ll copy into our test installation as tst.js. Then we run jsctags over it:

$ node.exe jsctags/bin/jsctags --sort=yes --locals tst.js

and get a text file tags that includes lines such as

z       tst.js  /^                  var z;$/;"  v       lineno:10       scope:7:31-12:18        type:(local)
z       tst.js  /^  var z = [function(x) { return (function(x) { return x; }(function(x) { return x; }))(x); },x];$/;"  v       lineno:15       scope:3:15-17:2 type:Array[any function(any)]
z       tst.js  /^var z;$/;"    v       lineno:2        scope:1:1-19:1  type:any 

indicating three separate declarations of variable z, with different line numbers and scope ranges.
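If you want to process such lines yourself, the following sketch splits one tags line into its parts. It assumes the usual extended tags layout (tab-separated name, file and search command, then ;" followed by the kind and key:value extension fields); the function name parseTagLine and the shape of the returned object are made up for this example.

```javascript
// Split one extended-format tags line into its parts.
// Assumed layout: name<Tab>file<Tab>excmd;"<Tab>kind<Tab>key:value...
function parseTagLine(line) {
  const sep = line.indexOf(';"\t');
  if (sep < 0) return null; // no extension fields: not handled in this sketch
  const [name, file, ...rest] = line.slice(0, sep).split("\t");
  const excmd = rest.join("\t"); // the search pattern may itself contain tabs
  const [kind, ...extras] = line.slice(sep + 3).split("\t");
  const fields = {};
  for (const extra of extras) {
    // Split at the first colon only: the scope value contains colons too.
    const colon = extra.indexOf(":");
    if (colon > 0) fields[extra.slice(0, colon)] = extra.slice(colon + 1);
  }
  return { name, file, excmd, kind, fields };
}

const sample = 'z\ttst.js\t/^var z;$/;"\tv\tlineno:2\tscope:1:1-19:1\ttype:any';
const tag = parseTagLine(sample);
console.log(tag.name, tag.kind, tag.fields.scope);
```

Looking for the first ;" followed by a tab is a simplification: a source line that happens to contain that sequence would confuse it.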

We can now open tst.js in Vim, and source the Vim script for processing scoped tags (for non-testing use, you’d copy scoped_tags.vim into one of Vim’s autoload directories, and copy the key binding code into your vimrc file — the default key bindings are optional, so feel free to define your own)

:source scoped_tags/autoload/scoped_tags.vim
:call scoped_tags#DefaultKeyBindings()

Try moving the cursor on some of the variables in the file, then hit _] to go to the definition, or _* / _# to move to the next/previous variable occurrence (try activating Vim’s standard search highlighting :set hlsearch to mark all occurrences in scope). Note how navigation is aware of hoisting and nesting of bindings, as in the following screenshots.
[Screenshots: scoped tags test]
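In case hoisting is unfamiliar, here is a small standalone illustration (separate from tst.js, with made-up names) of why a scope analysis cannot simply match a use against the nearest preceding declaration:

```javascript
var z = "outer";

function f() {
  // This reads the *inner* z: its declaration is hoisted to the top of f,
  // but the assignment below has not run yet.
  const before = z;
  var z = "inner";
  return [before, z];
}

function g() {
  // Function declarations are hoisted together with their body,
  // so inner can be called before the line that declares it.
  return inner();
  function inner() { return "hoisted"; }
}

const [before, after] = f();
console.log(before, after); // before is undefined (not "outer"), after is "inner"
console.log(g());
```

In both functions the scope of the inner binding covers the whole function body, including the lines before the declaration, which is what the scope ranges in the tags file express.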

Neither conventional tags generators nor conventional search handle scopes, so this is new functionality (in Vim, as far as I know;-). Note that you can still use the standard tags commands with the extended tags files, though you’ll usually have multiple matching tags to select from or iterate through.

On to more interesting things: let’s generate tags for the jsctags sources we’ve just installed. We direct the output with our new option --locals to tags.locals, so that we can compare with jsctags’ default output, in tags (the --locals variant should have nearly four times as many tags):

$ cd jsctags/
$ node.exe bin/jsctags --sort=yes --locals -ftags.locals -Llib -Lnarcissus\\lib lib narcissus
$ node.exe bin/jsctags --sort=yes -Llib -Lnarcissus\\lib lib narcissus

Translation: we’re using node.exe to run bin/jsctags, to generate sorted tags files, tags.locals with tags for local variables, tags without; the -L flags indicate the top directories of libraries, for module handling; and we process all .js files in the lib and narcissus directory trees. The backslashes in -Lnarcissus\\lib should be replaced with a single forward slash on non-Windows systems (we could add bin/jsctags to the files to be processed, but we’d need to comment out its first line, which isn’t Javascript).

As before, we need to load our script for scoped tags navigation. Since we’re now dealing with a multi-module project spanning several directories, we also need to make sure that Vim can find the project-wide tags file, no matter which file we’re editing:

:source scoped_tags/autoload/scoped_tags.vim
:call scoped_tags#DefaultKeyBindings()
:set tags=./tags.locals;

The brief screencast linked above shows how standard tags and search (with Vim’s built-in support) combine with scoped tags and search (supported by the scoped_tags.vim script) as a natural extension of Vim’s code navigation. Apart from parsing speed and language coverage (via narcissus), building scoped tags generation into DoctorJS’s jsctags stands to profit from further developments in DoctorJS’s type and property membership inference. I hope you’ll find this as useful as I do!-)

Btw, a word in closing: the scoped tags format and Vim support are intentionally language-independent; all you need if you want to reuse it for your own favourite language is a scope-tracking parser for that language – then just add source span information and link declarations to their scopes, and you’re all set for generating scoped tags!
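As a rough sketch of that last step, assuming your parser already hands you a declaration together with the source span of its scope, emitting a scoped tag could look like this. The decl record and the function name formatScopedTag are hypothetical, and the pattern escaping is simplified.

```javascript
// Format one scoped tag line from a declaration record.
// `decl` is a hypothetical shape; a real parser would supply its own.
function formatScopedTag(decl) {
  const { name, file, sourceLine, kind, line, scope } = decl;
  // Escape backslashes and slashes so the source line can serve as a /^...$/ pattern.
  const pattern = sourceLine.replace(/[\\\/]/g, "\\$&");
  const range = `${scope.start.line}:${scope.start.column}-${scope.end.line}:${scope.end.column}`;
  return [
    name,
    file,
    `/^${pattern}$/;"`,
    kind,
    `lineno:${line}`,
    `scope:${range}`,
  ].join("\t");
}

const tagLine = formatScopedTag({
  name: "z",
  file: "tst.js",
  sourceLine: "var z;",
  kind: "v",
  line: 2,
  scope: { start: { line: 1, column: 1 }, end: { line: 19, column: 1 } },
});
console.log(tagLine);
```

Remember that Vim expects a sorted tags file unless told otherwise, so sort the generated lines before writing them out.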

Update (18/06/2012): I’m now using my new estr toolsuite to generate scoped tags, instead of the modified doctorjs. Estr is in active development, and easier to install.

Update (20/03/2014):

  • a reminder that I no longer support my patched doctorjs (the original project has been unsupported for years)
  • I no longer use tags files for JavaScript/TypeScript at all – I recommend using ternjs or typescript-tools instead, they support jump-to-definition directly

9 Responses to Scoped tags support for Javascript, in Vim

  1. I got up to
    node jsctags/bin/jsctags --sort=yes --locals tst.js
    but when I ran it I got this message
    The “sys” module is now called “util”. It should have a similar interface.

    • The “sys” module was apparently renamed and deprecated long ago, but the deprecation warning was suppressed. From node v0.6.4 on, this changed. I have pushed a change to my doctorjs version, renaming all “sys” imports to “util”. Seems to work now, with node v0.6.12, but I’ve only run a quick test.

      • Thanks, I didn’t get any errors this time, but if I press _] it just goes to the end of the file. _[ goes to the beginning. _* and _# work though.

        • Strange – it works for me, and _[ shouldn’t be mapped to anything. That suggests that someone else remaps _[ and _] after you call scoped_tags#DefaultKeyBindings(). If :map _] still shows scoped_tags#GotoTagDefinition(..), then please open a ticket on the issue tracker. That is more suitable for detailed discussions than the comments here:-)

  2. majutsushi says:

    This is a pretty neat idea. There’s just one thing I disagree with: in my opinion the extension field should be called ‘range’ rather than ‘scope’, otherwise it’s too easy to confuse it with the scope that the tag is in (like the class or namespace name). For example, in Tagbar I parse the scope-describing fields like ‘class:’ into a general ‘scope’ field, so it would be rather confusing to have two fields with the same name but different meaning.

    • Thanks. Yes, the confusion is sad, but ‘scope’ is the standard CS term here, while ‘class’, ‘interface’, or ‘type’ might have been used for the ‘is this property access going to work’ use. doctorjs’s namespace inference has been broken for so long that I’ve given up hope for bringing the two together. Perhaps it is time for newer JS analysis tools to return to standard terminology? On the other hand, I understand that you need to support the choices made by tag generators, whether or not you agree with them.

      • majutsushi says:

        It’s not really about doctorjs choices in particular, ctags does the same thing with regard to the “class:” fields etc., and I think that’s actually the right thing to do. Those fields give information that tell you something about the semantics of the tag like “this tag appears in the scope of class X”, independent of how the information is actually represented in the source files. Your “scope:” field however gives the physical range of where the tag is valid *in the source file* and only acquires its semantic meaning of “scope” if it gets interpreted correctly by Vim (or any other compatible program). Therefore I think it would make sense if the field name reflects this difference between physicality and semantics. It’s a relatively minor issue of course, but I do think it would be more consistent.

        • Actually, neither doctorjs nor ctags use ‘scope’. Ctags uses ‘class’ for class members, it does not support lexical scope at all. Some languages introduce class members into the scope chain (I think C++ members are directly accessible in class methods), some don’t (since ‘with’ is deprecated, JS separates object property accessibility from variable scope). This is not a question of how the information is represented in the tags file, it is a question of whether we are talking of variables and their scope or object/class members and their accessibility – two separate concepts. The ‘scope’ field in my doctorjs-based tags generator describes lexical scope (the only reason for using a range-based representation is that this is something Vim can understand and handle, independent of source language). If that causes problems for tagbar users, I could add a field-renaming option, but I’d rather stick to standard terminology here.

        • majutsushi says:

          I know what scope is, and this is not really a Tagbar issue. Yes, your range information does refer to the lexical scope once interpreted. My point is just that calling it by the semantic name “scope” seems to imply that it’s independent of changes to the physical representation of the code in a source file. For example, if I add or delete some lines at the top of the file, the actual lexical scope of a tag (as interpreted by the language) does not change, but the information in your “scope:” field is no longer accurate and could even refer to a different class or something else (it’s of course not very likely to line up perfectly somewhere else, but it’s theoretically possible), at least until the tag file gets updated. So all I’m saying is that in my opinion a name that reflects the fact that the field contains information about the *physical representation* of the scope, and not about the semantics (like the name of the enclosing scope), would make more sense. Maybe a name like “scoperange” would be best so the meaning is completely clear.
