Location: United States

A New Yorker relocated to Florida. I have fifteen years in the IT industry, with stints in product management, database management, application programming... I've been a CIO, a consultant, a software evangelist... one of these days I'll write up a proper profile.

Thursday, January 26, 2006

Virtual Folders and Indexing - The Ten Thousand Foot View

I thought that today we could start with a quick summary of press impressions of the virtual folders in Windows Vista.

Stacks of Keyword Searches
Virtual folders are probably the second most significant user interface change in Windows Vista. (I consider Microsoft's long-overdue move away from default administrator logins number one, and yes, besides being a change in security design, it's a UI design change too!) Yesterday's topic focused on extended file properties/metadata and how they relate to virtual folder support. Today's article will attempt to take more of a ten thousand foot view.

The following articles discuss virtual folders.

To summarize: the press by and large is focusing on saved searches. It seems reasonable to assume that they consider virtual folders merely a copy of Apple's smart folder feature. What are they missing? To answer that, consider what searches are currently used for: a web-centric view of text data (i.e., a virtual folder is merely a saved text search, and therefore a browser-like feature). Yet most times when searching for a local document, the only text being searched is the file name!

There's another view: the database-centric view. If virtual folders are to be a central feature of WinFS, the database/file system, we should look carefully at where the virtual folder feature may be going. If you have structured storage, you can have structured queries. A small case in point: take a look at the Windows attributes for a JPEG file: the dimensions are a string, such as "1024 x 768." Can't sort on the height of a picture? In Windows Vista, there is a width column and a height column. How much would you like to bet we'll be able to query them as numbers when WinFS arrives?
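To make the point about the dimensions string concrete, here is a minimal Python sketch. It has nothing to do with how Windows or WinFS actually store properties; the `Picture` type, the file names, and the `parse_dimensions` helper are all made up for illustration. It only shows why a "1024 x 768" string sorts badly and what changes once width and height are real numbers.

```python
import re
from typing import NamedTuple, Tuple


class Picture(NamedTuple):
    name: str
    dimensions: str  # display string in the style Explorer shows, e.g. "1024 x 768"


def parse_dimensions(text: str) -> Tuple[int, int]:
    """Split a 'W x H' display string into integer (width, height)."""
    match = re.fullmatch(r"\s*(\d+)\s*x\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized dimensions: {text!r}")
    return int(match.group(1)), int(match.group(2))


# Hypothetical sample files.
pictures = [
    Picture("banner.jpg", "1024 x 96"),
    Picture("photo.jpg", "1024 x 768"),
    Picture("poster.jpg", "800 x 1200"),
]

# Sorting on the raw string compares characters, not numbers.
by_string = sorted(pictures, key=lambda p: p.dimensions)

# Sorting on the parsed height gives a true numeric order.
by_height = sorted(pictures, key=lambda p: parse_dimensions(p.dimensions)[1])

print([p.name for p in by_string])  # photo.jpg, banner.jpg, poster.jpg
print([p.name for p in by_height])  # banner.jpg, photo.jpg, poster.jpg
```

The string sort puts the 96-pixel-high banner between the two taller pictures, which is exactly the kind of nonsense separate numeric width and height columns avoid.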

Consider also the "stock list of well known property types" which Explorer already supports, such as keywords (see Filtering Well-Known Properties for more information).

The whole purpose of the IFilter interface described in that article is to let the Windows indexing service make a search for files with a specified property (for example, a keyword such as "contract") instantaneous. Currently (in Windows XP), the indexing service does not index network storage by default, but you can add a network location, or you can configure the indexing service to query a remote system's indexing service. This weblog will dig deeper into the indexing service as it applies to external files linked into a virtual folder in a future article. In Windows Vista beta two, at least, you cannot drag a file from a network folder into a stack, nor will a network file appear in the virtual folder for a keyword when you add that keyword to the file. To learn more about how network indexing currently works, see the article Using Microsoft Indexing Service to search for files in Network by Boris Goussakov.
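Why would an index make a keyword search feel instantaneous? The general idea, sketched below in Python, is an inverted index: the properties are read once, ahead of time, and a search becomes a single lookup instead of a scan of every file. This is a generic illustration only, not the actual design of the Windows indexing service; the function names and file paths are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_keyword_index(files: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each keyword (lowercased) to the set of file paths that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for path, keywords in files.items():
        for keyword in keywords:
            index[keyword.lower()].add(path)
    return dict(index)


def lookup(index: Dict[str, Set[str]], keyword: str) -> List[str]:
    """Return the paths tagged with the keyword, without touching the files."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical files and their keyword properties.
files = {
    r"C:\Docs\lease.doc": ["contract", "legal"],
    r"C:\Docs\invoice.xls": ["billing"],
    r"C:\Docs\nda.doc": ["Contract"],
}

index = build_keyword_index(files)  # done once, e.g. overnight or at idle time
print(lookup(index, "contract"))    # the two .doc files
```

Building the index is the expensive part, which is why it matters so much whether network locations are included in it.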

But for the future, if it should be implemented, structured searches that intelligently query local and select external indices have tremendous applications. For example, given that many files in a corporate environment are housed on non-local drives, this could save a tremendous amount of time searching for documents, especially if the indices could be updated overnight or during idle time. For the home user, many of whom have never quite mastered the hierarchical file structure, this could make the "where did I put that file" question go away once and for all.

The biggest problem in implementing this, besides shipping WinFS, may be classifying "legacy" files for which no properties were defined at the time of file creation. I just checked my "My Documents," "Shared Documents," and my external drive. Not counting the folder on the external drive for backup, I've got nearly 200,000 files in those places. Clearly, I'll have to add keywords to many, many files at once if I ever hope to catch up. I guess this means that for me to gain a "Vista" upon my files, it will take much more of my time than an operating system upgrade!

Windows XP File Properties/Metadata
For new files, it's probably imperative that users learn to fill in the extended properties dialogs. Microsoft Office provides a means of automatically prompting the user to fill in title, subject, keywords etc. when saving a file for the first time. In yesterday's topic, I suggested it might be a good idea to add these to the Save As common dialog.

Many file types might have attributes that don't fit into the "well known property types" or their developers may wish to expose text within the file to indexed text searches. See the article mentioned yesterday (Be Discoverable). For applications that define their own file formats, an IPropertyStore interface can manage custom property types. It's not difficult to imagine a custom file format for a print application, for example, that stores a "target press" property that could then be searchable.

And of course, providing protocol handlers for exposing text within a custom file format is a main concern of "Be Discoverable" as well. It appears that what Microsoft is aiming for is to meet and exceed competition in text search, and perhaps most importantly, move users away from a hierarchical file system by encouraging file search by easily identifiable metadata properties and attributes. I have not yet confirmed this, but it appears the search box in the Start menu utilizes the index service (either that, or it heavily caches the virtual folder stacks!).

Searching for a keyword from the Start Menu search box
This is a major change from text-based search or tags, and probably a harbinger of WinFS. It appears that there may be a big difference between a saved query (Microsoft's term used in the Windows Vista Self Guided Tour) and a saved full text search.
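One way to picture that difference is the Python sketch below. It is an assumption-laden toy, not Vista's implementation: `FileRecord`, `full_text_search`, `keyword_query`, and `open_virtual_folder` are names invented here. Both kinds of saved search are stored as a rule and re-run each time the folder is opened, but one matches on the text inside the file and the other on a metadata property.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FileRecord:
    name: str
    text: str                                   # extracted text content, if any
    keywords: List[str] = field(default_factory=list)


Predicate = Callable[[FileRecord], bool]


def full_text_search(term: str) -> Predicate:
    """Saved full text search: matches when the term occurs in the content."""
    return lambda f: term.lower() in f.text.lower()


def keyword_query(keyword: str) -> Predicate:
    """Saved query: matches when the keyword property is set on the file."""
    return lambda f: keyword.lower() in {k.lower() for k in f.keywords}


def open_virtual_folder(files: List[FileRecord], rule: Predicate) -> List[str]:
    """Re-evaluate the saved rule against the current set of files."""
    return [f.name for f in files if rule(f)]


# Hypothetical files.
files = [
    FileRecord("lease.doc", "This contract begins in March.", ["contract"]),
    FileRecord("memo.txt", "Remember to sign the contract."),
    FileRecord("scan.jpg", "", ["contract"]),
]

print(open_virtual_folder(files, full_text_search("contract")))  # lease.doc, memo.txt
print(open_virtual_folder(files, keyword_query("contract")))     # lease.doc, scan.jpg
```

The scanned image has no text to search, yet it lands in the keyword-based folder, while the memo that merely mentions the word does not. That is the gap between searching content and querying properties.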

All trademarks are properties of their respective owners. Windows is a registered trademark of Microsoft Corporation in the United States and other countries. This weblog is an independent publication and is not affiliated with, nor has it been authorized, sponsored, or otherwise approved by Microsoft Corporation. Apple is a registered trademark of Apple Computer.

