Transformasjon fra XML til SQL
Versjon 1
Vi skriver en transformasjon med outputmetode text.
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text" encoding="UTF-8"/> <xsl:template match="/"> USE bsdiverse; DROP TABLE IF EXISTS olresults; CREATE TABLE olresults( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); <xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/> </xsl:template> <xsl:template match="//result"> INSERT INTO olresults(place,year,event,name,nation,result) values( '<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@place"/>', '<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@year"/>', '<xsl:value-of select="ancestor::athlet/ancestor::event/@dist"/>', '<xsl:value-of select="ancestor::athlet/name"/>', '<xsl:value-of select="ancestor::athlet/nation"/>', '<xsl:value-of select="ancestor::athlet/result"/>'); </xsl:template> </xsl:stylesheet>
Vi forutsetter at databasen (olympic) er opprettet, fjerner tabellen results, oppretter den på ny og fyller den fra XML-dataene.
Resultatet av transformasjonen (et utdrag) er slik:
USE olympic: DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Maurice Greene', 'USA', '9.87'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Obadele Thompson', 'BAR', '10.04'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Dwain Chambers', 'GBR', '10.08'); ...
Vi kan så bruke denne fila, dbresult.txt, mot en database. F.eks. i følgende dialog i et dos-vindu:
C:\temp\xsldemo>mysql -u root -p Enter password:****** Welcome to the MySQL monitor. Commands end with . or \g. Your MySQL connection id is 3 to server version: 5.0.15-nt Type 'help' or '\h' for help. Type '\c' to clear the buffer mysql>source dbresult.txt;
Versjon 2
Vi beholder den samme databasestrukturen, og endrer transformasjonen som lager SQL. Endringene er to:
- Vi bruker kortformen ../ for å angi foreldre noden.
- Vi lage en samlet SQL-setning for insert oppgaven.
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text" encoding="UTF-8"/> <xsl:template match="/"> USE bsdiverse; DROP TABLE IF EXISTS olresults; CREATE TABLE olresults( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO olresults(place,year,event,name,nation,result) values <xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/> ; </xsl:template> <xsl:template match="//result"> ( '<xsl:value-of select="../../../@place"/>', '<xsl:value-of select="../../../@year"/>', '<xsl:value-of select="../../@dist"/>', '<xsl:value-of select="../name"/>', '<xsl:value-of select="../nation"/>', '<xsl:value-of select="."/>' ) <xsl:if test="not( position() = last())">,</xsl:if> </xsl:template> </xsl:stylesheet>
Resultatet blir slik (utdrag):
USE olympic: DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values ( 'Sidney', '2000', '100m', 'Maurice Greene', 'USA', '09.87') , ( 'Sidney', '2000', '100m', 'Obadele Thompson', 'BAR', '10.04') , ..... ( 'Athens', '2004', '400m', 'Michael Blackwood', 'JAM', '45.55') ;
Oppdatering av basen blir som i versjon 1, men det går vesentlig raskere.
DB-oppdatering fra XML
I denne løsningen skriver vi et Pythonskript som både foretar tansformasjonen og oppdaterer databasen. Vi trenger vi to biblioteker for å få til dette:
- Vi benytter oss av biblioteket lxml [1] for å få til en transformasjon fra et Pytonskript. Vi bruker den samme transformasjonen som vi bruket i versjon 2 ovenfor.
- Vi benytter også en kopling mellom MySQL og Python, se MySQL for Python [2]
Pythonkoden blir slik:
import MySQLdb from lxml import etree """ Use lxml and MySQL Transform XML-structure and update database, one record for each result """ XML_FILE='all_results2.xml' XSLT_FILE='dbtrans2.xslt' #---------------------------------- # file IO def getTextFile(filename): try: file=open(filename,'r') intext=file.read() file.close() return intext except: print 'Error reading file ',filename return None #---------------------------------- # connect and execute a sql-request def connectAndExecute(sql): try: myBase=MySQLdb.connect(host=' itstud.hiof.no', user='borres', passwd='hemmelig', db='bsdiverse') myTab=myBase.cursor() myTab.execute(sql) myBase.commit() myBase.close() except MySQLdb.Error, e: print "Error %d: %s" % (e.args[0], e.args[1]) myBase.close() #---------------------------------- # Perform transformation def transform(): xmlTree=etree.parse(XML_FILE) xsltTree=etree.parse(XSLT_FILE) transform=etree.XSLT(xsltTree) resultTree=transform(xmlTree) # and update database sqlsentences=str(resultTree).split(';') #print sqlsentences for sentence in sqlsentences: if len(sentence.strip()) > 0: #print sentence connectAndExecute(sentence) transform()
XML fra database
Vi kan gjøre dette på forskjelige måter. Her er valgt en løsning som bruker et enkelt Pythonskript som leser databasen på tradisjonell måte.
Skriptet er laget slik at det skal kunne handtere vilkårlige olympiader, og det skal kunne handtere andre resulteter enn 100m, 200m og 400m.
import MySQLdb """ Use MySQL Read a database and establish a XML-structur Database table: CREATE TABLE results ( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); """ XML_FILE='made_results.xml' #---------------------------------- # file IO def storeTextFile(filename,txt): try: outfile=open(filename,'w') outfile.write(txt) outfile.close() except: print 'Error writing file ',filename #---------------------------------- # connect and execute a sql-request def connectAndExecute(sql): try: myBase=MySQLdb.connect(host=' itstud.hiof.no', user='borres', passwd='hemmelig', db='olympic') myTab=myBase.cursor() myTab.execute(sql) myBase.commit() myBase.close() return myTab.fetchall() except MySQLdb.Error, e: print "Error %d: %s" % (e.args[0], e.args[1]) myBase.close() #---------------------------------- # build the simple xml-structure: XML_WRAPPER="""<?xml version="1.0" encoding="utf-8"?> <IOC> %s </IOC> """ XML_FRAGMENT=""" <athlet> <name>%s</name> <nation>%s</nation> <result>%s</result> </athlet>""" SQL_SELECT_ATHLETS="""SELECT name,nation,result FROM results WHERE place='%s' AND event='%s';""" SQL_SELECT_GAMES="""SELECT place,year FROM results;""" SQL_SELECT_EVENTS="""SELECT event FROM results;""" GAMES=[] EVENTS=[] # final build def buildXMLTree(): xmlstr='' for game in GAMES: xmlstr+=' <OlympicGame place="%s" year="%s">'%\ (game[0],game[1]) for event in EVENTS: xmlstr+='\n <event dist="%s">'%event results=connectAndExecute(SQL_SELECT_ATHLETS%(game[0],event)) for ix in range(0,len(results)): record=results[ix] xmlstr+=XML_FRAGMENT%(record[0],record[1],record[2]) xmlstr+='</event>\n' xmlstr+='</OlympicGame>\n' return XML_WRAPPER%xmlstr # find all pairs of place,year def prepareGameList(): #return [['Barcelona','1992'],...['Athens','2004']] gamelist=[] results=connectAndExecute(SQL_SELECT_GAMES) for result in results: pair=[result[0],result[1]] if pair in gamelist: continue gamelist.append(pair) return gamelist #find all events def prepareEventList(): # return ['100m','200m','400m'] eventlist=[] results=connectAndExecute(SQL_SELECT_EVENTS) for result in results: event=result[0] if event in eventlist: continue eventlist.append(event) return eventlist GAMES=prepareGameList() EVENTS=prepareEventList() storeTextFile(XML_FILE,buildXMLTree())
Resultatet blir som den opprinnelige resultatfila. resultat
Så kan vi ta denne fila og transformere til et annet format vi måtte være interesserte i, f.eks JSON. Alt dette kan foregå på tjeneren, slik at vi kan bestille JSON fra en database. Vi skal se litt på hvordan vi kan gjøre dette på en ganske smidig og standardisert måte i modulen DB2JSON .