TIBCO Spotfire Community

Welcome to TIBCO Spotfire Community Sign in | Join | Help

Your daily commits from subversion using SQL Server xml bulk import

I wanted to get a good overview of our subversion commits and found that I can get the some of the data from subversion in xml using --xml flag. To get the actual data into a database without writing to much code I found a good way of bulk importing xml files to SQL Server which seemed straight forward. The SQLXML bulk load uses an XSD to interpret the information in the XML file to be imported, by adding more information to the XSD the xml bulk load can load data to multiple tables and keeping referential integrity.

 
  1. Create a .Net program that encapsulate the COM component

  2. Create an XSD

  3. Create a bat file

  4. Create an information link and add properties for automatic links

 

1. .Net program

/// <summary>From: Using SQLXML Bulk Load in the .NET Environment
/// http://msdn.microsoft.com/en-us/library/ms171878.aspx
/// </summary>
public class Program
{
[STAThread] // Must be STA for SQLXMLBulkLoad to work.
public static int Main(string[] args)
{
if (args.Length < 4)
{
Console.WriteLine("Bulk import of subversion log file");
Console.WriteLine("SubversionLogImport.exe <connectionstring> <schemafile> <datafile> <errorfile>");
return 1;
}

// "Provider=sqloledb;server=server;database=databaseName;integrated security=SSPI";
string connectionString = args[0];

// "schema.xml"
string schemaFile = args[1];

// "data.xml"
string datafile = args[2];

// "error.xml"
string errorFile = args[3];

try
{
SQLXMLBulkLoad4Class xmlBulkLoad = new SQLXMLBulkLoad4Class();
xmlBulkLoad.ConnectionString = connectionString;
xmlBulkLoad.ErrorLogFile = errorFile;
xmlBulkLoad.KeepIdentity = false;
xmlBulkLoad.SchemaGen = true;
xmlBulkLoad.SGDropTables = true; // Remove existing tables
xmlBulkLoad.Execute(schemaFile, datafile);
}
catch (Exception e)
{
Console.WriteLine(e.ToString());
return 1;
}


return 0;
}
}

2. The XSD

All values of SVNLogEntries and SVNLogPaths represent table names. See MSDN: Record Generation Process for information about the details.

<?xml version="1.0" encoding="utf-8"?>
<!-- For "svn log" -->
<!-- -->
<!-- Credits to Lieven Govaerts (http://markmail.org/message/276jubfmfieqdtdr) for creating a starting point -->
<!-- -->
<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sql="urn:schemas-microsoft-com:mapping-schema">

<xsd:annotation>
<xsd:appinfo>
<sql:relationship name="LogPathRelationship"
parent="SVNLogEntries"
parent-key="Revision"
child="SVNLogPaths"
child-key="Revision" />
</xsd:appinfo>
</xsd:annotation>

<!-- log: root element -->
<xsd:element name="log" sql:is-constant="1" >
<xsd:complexType>
<xsd:sequence>
<xsd:element name="logentry" type="logentryType" minOccurs="0" maxOccurs="unbounded" sql:relation="SVNLogEntries"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>

<!-- logType: collection of log entries -->
<!-- logentryType: describes one revision -->
<xsd:complexType name="logentryType">
<xsd:sequence>
<xsd:element name="author" type="xsd:string" sql:field="Author" sql:datatype="varchar(30)" />
<xsd:element name="date" type="xsd:dateTime" sql:field="Timestamp" sql:datatype="datetime" />
<xsd:element name="paths" minOccurs="0" sql:is-constant="1">
<xsd:complexType>
<xsd:sequence>
<xsd:element
name="path"
minOccurs="0"
maxOccurs="unbounded"
sql:field="RepositoryPath"
sql:datatype="varchar(255)"
sql:relation="SVNLogPaths"
sql:relationship="LogPathRelationship">
<xsd:complexType>
<xsd:simpleContent>
<xsd:extension base="xsd:string">
<xsd:attribute name="action" type="actionType" use="required" sql:field="Action" sql:datatype="char(1)"/>
<xsd:attribute name="copyfrom-path" type="xsd:string" sql:field="CopyFromRepositoryPath" sql:datatype="varchar(255)"/>
<xsd:attribute name="copyfrom-rev" type="xsd:unsignedInt" sql:field="CopyFromRevision"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="msg" type="xsd:string" minOccurs="0" sql:field="Message" sql:datatype="varchar(2000)"/>
</xsd:sequence>
<xsd:attribute name="revision" type="xsd:unsignedInt" use="required" sql:field="Revision"/>
</xsd:complexType>

<!-- actionType: one character: A D M or R -->
<xsd:simpleType name="actionType">
<xsd:restriction base="xsd:string">
<xsd:pattern value="[ADMR]"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>

3. The command file

Download a subversion command line client from http://subversion.tigris.org/getting.html#binary-packages

Then create a bat file with the following content, make sure to schedule it every morning

svn log <url> --xml -v > svnlog.xml

SubversionLogImport.exe <connectionstring> svnlog.sqlserverimport.xsd svnlog.xml error.xml

4. Information Links

Create one information link for each of the tables, if ViewVC (or something similar have been setup) make sure to add a LinkTemplate property with the value ‘http://<subversion server>/viewvc?limit_changes=0&view=rev&revision={0}’ on the Revision column elements to have automatic configuration of links in the Table Visualization.

Conclusion

Without to much coding it was fairly easy to import rather large xml files and automate the import every day without to much coding. In later articles I’ll use this data to show some more advanced usage of TIBCO Spotfire with regards to data manipulation.

Comments
 

Knowledge Base said:

A previous post showed how SQL Server bulk import of xml could be used to import subversion commits.

January 15, 2009 2:46 AM

About Daniel Vulcan

Daniel Vulcan is a developer with the TIBCO Spotfire and the TIBCO Spotfire Web Player teams.