Pages

Friday, August 14, 2015

How to Use xp_dirtree to List All Files in a Folder

undocumented extended stored procedure; master.sys.xp_dirtree.  This stored procedure will display a list of every folder, every subfolder, and every file for path you give it.

Xp_dirtree has three parameters: 
  1. directory - This is the directory you pass when you call the stored procedure; for example 'D:\Backup'.
  2. depth  - This tells the stored procedure how many subfolder levels to display.  The default of 0 will display all subfolders.
  3. file - This will either display files as well as each folder.  The default of 0 will not display any files.
For today's example, we just want to display all of our backup files (*.BAK) in a particular folder. We need to all of the other parameters in order to show the files as well as any subfolders.
EXEC master.sys.xp_dirtree 'D:\Backup\TRON4\TEST2\MyDb1',0,1;

The output below will show us each subfolder and every file for the given directory.
We're only concerned about the BAK files.  So we'll need to create a temp table to hold the output of xp_dirtree.  Then we can select only the rows that are an actual file (isfile = 1) and that have a file extension of BAK.

IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;
CREATE TABLE #DirectoryTree (
       id int IDENTITY(1,1)
      ,subdirectory nvarchar(512)
      ,depth int
      ,isfile bit);
INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree 'D:\Backup\TRON4\TEST2\MyDb1',1,1;
SELECT * FROM #DirectoryTree
WHERE isfile =AND RIGHT(subdirectory,4) = '.BAK'
ORDER BY id;
GO

Below is the output.
This example only shows the files for a specific subfolder, which is used to get a list of files that are needed for purging before another database backup runs.  This stored procedure also can accept UNC paths (\\Server\Folder) to get a list of files and subfolders from a network drive.  In a future post, I will add on to this script to show all files in all subfolders for a given directory.  Until then, play around with xp_dirtree and its parameters to see the different outputs.  You may be able to incorporate this into your own code.
 
 
I demonstrated how you can use an undocumented stored procedure, master.sys.xp_dirtree, to list all files in a folder.  The previous posting mostly talked about how the stored procedure worked and the output it generated.  For this posting, I have expanded upon the script to parse all subdirectories and list all files in all subfolders from the starting point.
Just for a quick review, xp_dirtree has three parameters:
  1. directory - This is the directory you pass when you call the stored procedure; for example 'D:\Backup'.
  2. depth  - This tells the stored procedure how many subfolder levels to display.  The default of 0 will display all subfolders.
  3. isfile - This will either display files as well as each folder.  The default of 0 will not display any files.
Now for the updated script:

DECLARE
       @BasePath varchar(1000)
      ,@Path varchar(1000)
      ,@FullPath varchar(2000)
      ,@Id int;
--This is your starting point.
SET @BasePath = 'D:\Backup';
--Create a temp table to hold the results.
IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;
CREATE TABLE #DirectoryTree (
       id int IDENTITY(1,1)
      ,fullpath varchar(2000)
      ,subdirectory nvarchar(512)
      ,depth int
      ,isfile bit);
--Create a clustered index to keep everything in order.
ALTER TABLE #DirectoryTree
ADD CONSTRAINT PK_DirectoryTree PRIMARY KEY CLUSTERED (id);
--Populate the table using the initial base path.
INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree @BasePath,1,1;
UPDATE #DirectoryTree SET fullpath = @BasePath;
--Loop through the table as long as there are still folders to process.
WHILE EXISTS (SELECT id FROM #DirectoryTree WHERE isfile = 0)
BEGIN
      --Select the first row that is a folder.
      SELECT TOP (1)
             @Id = id
            ,@FullPath = fullpath
            ,@Path = @BasePath + '\' + subdirectory
      FROM #DirectoryTree WHERE isfile = 0;
      IF @FullPath = @Path
      BEGIN
            --Do this section if the we are still in the same folder.
            INSERT #DirectoryTree (subdirectory,depth,isfile)
            EXEC master.sys.xp_dirtree @Path,1,1;
            UPDATE #DirectoryTree
            SET fullpath = @Path
            WHERE fullpath IS NULL;
            --Delete the processed folder.
            DELETE FROM #DirectoryTree WHERE id = @Id;
      END
      ELSE
      BEGIN
            --Do this section if we need to jump down into another subfolder.
            SET @BasePath = @FullPath;
            --Select the first row that is a folder.
            SELECT TOP (1)
                   @Id = id
                  ,@FullPath = fullpath
                  ,@Path = @BasePath + '\' + subdirectory
            FROM #DirectoryTree WHERE isfile = 0;
            INSERT #DirectoryTree (subdirectory,depth,isfile)
            EXEC master.sys.xp_dirtree @Path,1,1;
            UPDATE #DirectoryTree
            SET fullpath = @Path
            WHERE fullpath IS NULL;
            --Delete the processed folder.
            DELETE FROM #DirectoryTree WHERE id = @Id;
      END
END
--Output the results.
SELECT fullpath + '\' + subdirectory AS 'CompleteFileList'
FROM #DirectoryTree
ORDER BY fullpath,subdirectory;
--Cleanup.
IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;
GO

This code will process one folder level at a time since we're specifying 1 for the depth parameter.  For each folder in the #DirectoryTree table, we get a list of files and subfolders and insert it to the #DirectoryTree table. 
INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree @Path,1,1;

If it's a folder, then it the file parameter will be 0 and the WHILE loop will process once it gets further down the loop.  The folder needs a clustered index to keep all the rows in order.
ALTER TABLE #DirectoryTree
ADD CONSTRAINT PK_DirectoryTree PRIMARY KEY CLUSTERED (id);

After each folder is processed, the folder row is deleted from the table.  This is necessary so the WHILE loop doesn't turn into an infinite loop.
DELETE FROM #DirectoryTree WHERE id = @Id;

The full path is being saved for output later, so comparing it to the current base path will let us know if we need to update the @BasePath parameter.
IF @FullPath = @Path
      --Continue processing the same folder.
ELSE
      SET @BasePath = @FullPath;
      --Start processing the next subfolder.

This is how we can keep looping through each subfolder until each one has been processed.

The final step is to output the results.

SELECT fullpath + '\' + subdirectory AS 'CompleteFileList'
FROM #DirectoryTree
ORDER BY fullpath,subdirectory;
I have gotten pretty good performance out of this script. 

No comments:

Post a Comment