List of Dicts to RDF Conversion in Python Targeting Apache Jena Fuseki

To store some data in Apache Jena from Python, I'd like to have a generic conversion from a list of dicts to RDF and possibly back on query. For the list of dicts to RDF part I tried

Solution 1:

+ is the special character in HTTP Form encoding for a space but it should only be used in application/x-www-form-urlencoded.

For URIs, use %20 or decide on a replacement character such as _ for space because it looks a bit like a space.

In all these cases, there is not a space character in the URI - there is a +, %20 (three characters) or _. It is encoding, not an escape mechanism.
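
The three options can be compared with Python's standard urllib.parse module. This is only a minimal sketch: the name is taken from the example data in this thread, and the underscore variant is just one possible replacement convention.

```python
from urllib.parse import quote, quote_plus

name = "Elizabeth Alexandra Mary Windsor"

# percent-encoding, suitable for use inside a URI
uriEncoded = quote(name)
# form encoding (application/x-www-form-urlencoded) uses + for a space
formEncoded = quote_plus(name)
# replacement character convention: swap each space for an underscore
underscored = name.replace(" ", "_")

print(uriEncoded)
print(formEncoded)
print(underscored)
```

In each result the space is gone and has been replaced by different characters, which is why the receiving side has to know which encoding was applied.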

Solution 2:

The following code at least works and has correct "round-trip" behavior. The data inserted from a list of dicts can be retrieved with a corresponding query. Please comment with improvements or add a better answer.

If you always want typed literals, you can now specify this in the constructor of the Jena wrapper class.
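
The wrapper class itself is not shown in this answer. As a rough sketch of what such a constructor might look like, the hypothetical JenaSketch class below accepts the same mode, typedLiterals and debug arguments that the test passes to getJena, and prepares an HTTP request in the style of the SPARQL 1.1 Protocol (a POST with Content-Type application/sparql-update). The class name, the url parameter, the dataset URL and the buildUpdateRequest method are assumptions for illustration, not the author's code.

```python
import urllib.request

class JenaSketch:
    '''
    hypothetical stand-in for the Jena wrapper class
    '''

    def __init__(self, url, mode='query', typedLiterals=False, debug=False):
        # assumption: the dataset exposes services named after the mode,
        # e.g. <dataset>/query and <dataset>/update
        self.endpoint = "%s/%s" % (url.rstrip('/'), mode)
        self.mode = mode
        self.typedLiterals = typedLiterals
        self.debug = debug

    def buildUpdateRequest(self, insertCommand):
        '''
        prepare (but do not send) a POST request carrying the update
        '''
        return urllib.request.Request(
            self.endpoint,
            data=insertCommand.encode('utf-8'),
            headers={'Content-Type': 'application/sparql-update'},
            method='POST')

# assumed local Fuseki dataset URL, for illustration only
jena = JenaSketch("http://localhost:3030/example", mode='update', typedLiterals=True)
request = jena.buildUpdateRequest("INSERT DATA {}")
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out here because it needs a running server.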

The types

  • integer
  • decimal

are used for numeric literals to get proper "round-trip" behavior. In typed literal mode the insert generated by the unit test is:

PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine "0"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age "94.32637220476806"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
  foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
  foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine "1"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
  foafo:Person_CharlesPrinceofWales foafo:Person_age "71.7578047461618"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
  foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
  foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_GeorgeofCambridge foafo:Person_numberInLine "3"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
  foafo:Person_GeorgeofCambridge foafo:Person_age "7.072013799051315"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
  foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
  foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_HarryDukeofSussex foafo:Person_numberInLine "5"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
  foafo:Person_HarryDukeofSussex foafo:Person_age "35.92133993168922"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.
}

When typed literal mode is off, typed literals are only used for dates:

PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine 0.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age 94.32637220476806.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
  foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
  foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine 1.
  foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
  foafo:Person_CharlesPrinceofWales foafo:Person_age 71.7578047461618.
  foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
  foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
  foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_GeorgeofCambridge foafo:Person_numberInLine 3.
  foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
  foafo:Person_GeorgeofCambridge foafo:Person_age 7.072013799051315.
  foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
  foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
  foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_HarryDukeofSussex foafo:Person_numberInLine 5.
  foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
  foafo:Person_HarryDukeofSussex foafo:Person_age 35.92133993168922.
  foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.
}
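
Round-tripping can still work with the plain numbers because SPARQL reads an unquoted number as shorthand for a typed literal: 0 stands for an xsd:integer and 94.32637220476806 for an xsd:decimal. The sketch below mirrors the numeric token rules of the SPARQL 1.1 grammar with regular expressions. The function name shorthandDatatype is made up for this illustration, and booleans are deliberately left out.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# numeric literal tokens following the SPARQL 1.1 grammar rules
# INTEGER, DECIMAL and DOUBLE, each with an optional sign
NUMERIC_SHORTHANDS = [
    (re.compile(r'[+-]?[0-9]+'), XSD + "integer"),
    (re.compile(r'[+-]?[0-9]*\.[0-9]+'), XSD + "decimal"),
    (re.compile(r'[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+'),
     XSD + "double"),
]

def shorthandDatatype(token):
    '''
    return the datatype IRI implied by an unquoted numeric token,
    or None if the token is not a numeric shorthand
    '''
    for pattern, datatype in NUMERIC_SHORTHANDS:
        if pattern.fullmatch(token):
            return datatype
    return None

for token in ["0", "94.32637220476806", str(1e-07)]:
    print(token, shorthandDatatype(token))
```

One caveat follows from this: Python prints very small or very large floats in exponent notation (str(1e-07) is '1e-07'). SPARQL reads such a token as an xsd:double rather than an xsd:decimal, and in typed literal mode the same string is not a valid lexical form for an xsd:decimal literal.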

testListOfDictInsert

    def testListOfDictInsert(self):
        '''
        test inserting a list of Dicts and retrieving the values again
        using a person based example
        instead of
        https://en.wikipedia.org/wiki/FOAF_(ontology)
        
        we use an object oriented derivative of FOAF with a focus on datatypes
        '''
        listofDicts=[
            {'name': 'Elizabeth Alexandra Mary Windsor', 'born': self.dob('1926-04-21'), 'numberInLine': 0, 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
            {'name': 'Charles, Prince of Wales',         'born': self.dob('1948-11-14'), 'numberInLine': 1, 'wikidataurl': 'https://www.wikidata.org/wiki/Q43274' },
            {'name': 'George of Cambridge',              'born': self.dob('2013-07-22'), 'numberInLine': 3, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
            {'name': 'Harry Duke of Sussex',             'born': self.dob('1984-09-15'), 'numberInLine': 5, 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
        ]
        today=date.today()
        for person in listofDicts:
            born=person['born']
            age=(today - born).days / 365.2425
            person['age']=age
            person['ofAge']=age>=18
        typedLiteralModes=[True,False]
        entityType='foafo:Person'
        primaryKey='name'
        prefixes='PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>'
        for typedLiteralMode in typedLiteralModes:
            jena=self.getJena(mode='update',typedLiterals=typedLiteralMode,debug=True)
            errors=jena.insertListOfDicts(listofDicts,entityType,primaryKey,prefixes)
            self.checkErrors(errors)
            
        jena=self.getJena(mode="query")    
        queryString = """
        PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
        SELECT ?name ?born ?numberInLine ?wikidataurl ?ofAge ?age WHERE { 
            ?person foafo:Person_name ?name.
            ?person foafo:Person_born ?born.
            ?person foafo:Person_numberInLine ?numberInLine.
            ?person foafo:Person_wikidataurl ?wikidataurl.
            ?person foafo:Person_ofAge ?ofAge.
            ?person foafo:Person_age ?age. 
        }"""
        personResults=jena.query(queryString)
        self.assertEqual(len(listofDicts),len(personResults))
        personList=jena.asListOfDicts(personResults)   
        for index,person in enumerate(personList):
            print("%d: %s" %(index,person))
        # check the correct round-trip behavior
        self.assertEqual(listofDicts,personList)
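
The test relies on helpers that are not shown here, in particular dob and asListOfDicts. The following is a minimal sketch of how the way back from query results to Python values could work, assuming the results arrive as bindings in the SPARQL 1.1 Query Results JSON format (one object per row, mapping each variable to a dict with "value" and an optional "datatype"). The function bodies, the fromBinding helper and the sample row are assumptions for illustration, not the author's implementation.

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

def dob(isoDateString):
    '''hypothetical helper: convert an ISO date string to a datetime.date'''
    return datetime.date.fromisoformat(isoDateString)

def fromBinding(binding):
    '''convert a single result binding to a Python value based on its datatype'''
    value = binding["value"]
    datatype = binding.get("datatype")
    if datatype == XSD + "integer":
        return int(value)
    if datatype in (XSD + "decimal", XSD + "double"):
        return float(value)
    if datatype == XSD + "boolean":
        return value in ("true", "1")
    if datatype == XSD + "date":
        # limitation: a date with a timezone suffix is not handled here
        return dob(value)
    # plain literals and IRIs are returned as strings
    return value

def asListOfDicts(bindings):
    '''convert a list of result rows to a list of dicts'''
    return [{var: fromBinding(binding) for var, binding in row.items()}
            for row in bindings]

# hand-written sample row in the JSON results layout
sample = [{
    "name": {"type": "literal", "value": "George of Cambridge"},
    "born": {"type": "literal", "datatype": XSD + "date", "value": "2013-07-22"},
    "numberInLine": {"type": "literal", "datatype": XSD + "integer", "value": "3"},
    "ofAge": {"type": "literal", "datatype": XSD + "boolean", "value": "false"},
}]
personList = asListOfDicts(sample)
print(personList)
```

Mapping decimals back to float mirrors the insert side, where Python floats are written as decimal literals; whether the values compare equal after a round trip depends on the lexical form the server returns.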

insertListOfDicts

    def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
        '''
        insert the given list of dicts mapping datatypes according to
        https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
        
        mapped from 
        https://docs.python.org/3/library/stdtypes.html
        
        compare to
        https://www.w3.org/2001/sw/rdb2rdf/directGraph/
        http://www.bobdc.com/blog/json2rdf/
        https://www.w3.org/TR/json-ld11-api/#data-round-tripping
        https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
        '''
        errors=[]
        insertCommand='%s\nINSERT DATA {\n' % prefixes
        for index,record in enumerate(listOfDicts):
            if primaryKey not in record:
                errors.append("missing primary key %s in record %d" % (primaryKey,index))
            else:    
                primaryValue=record[primaryKey]
                encodedPrimaryValue=self.getLocalName(primaryValue)
                tSubject="%s_%s" %(entityType,encodedPrimaryValue)
                for keyValue in record.items():
                    key,value=keyValue
                    valueType=type(value)
                    if self.debug:
                        print("%s(%s)=%s" % (key,valueType,value))
                    tPredicate="%s_%s" % (entityType,key)
                    tObject=value    
                    if valueType == str:
                        tObject='"%s"' % value
                    elif valueType==int:
                        if self.typedLiterals:
                            tObject='"%d"^^<http://www.w3.org/2001/XMLSchema#integer>' %value
                    elif valueType==float:
                        if self.typedLiterals:
                            tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#decimal>' %value
                    elif valueType==bool:
                        # booleans are written as is
                        pass
                    elif valueType==datetime.date:
                        # dates are always written as typed literals
                        tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#date>' %value
                    else:
                        errors.append("can't handle type %s in record %d" % (valueType,index))
                        tObject=None
                    if tObject is not None:
                        insertCommand+='  %s %s %s.\n' % (tSubject,tPredicate,tObject)
        insertCommand+="\n}"
        if self.debug:
            print (insertCommand)
            print (insertCommand)
        self.insert(insertCommand)
        return errors
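
The getLocalName helper used above is not shown. Judging only from the generated subjects (for example "Charles, Prince of Wales" becomes CharlesPrinceofWales), it appears to drop everything except letters and digits, so a possible sketch is given below. The escapeLiteral function is an additional suggestion rather than part of the original code: string values are inserted between double quotes without escaping, so a value containing a double quote or a backslash would break the generated INSERT DATA command.

```python
def getLocalName(name):
    '''
    keep only letters and digits so that the result can be used
    as the local part of a prefixed name (behavior inferred from the sample output)
    '''
    return ''.join(ch for ch in name if ch.isalnum())

def escapeLiteral(text):
    '''
    escape backslashes, double quotes and line breaks
    for use inside a "..." string literal
    '''
    return (text.replace('\\', '\\\\')
                .replace('"', '\\"')
                .replace('\n', '\\n')
                .replace('\r', '\\r'))

print(getLocalName("Charles, Prince of Wales"))
print('"%s"' % escapeLiteral('William "Wills" Windsor'))
```

Note that stripping characters this way can map two different primary key values to the same subject, so it only suits keys that stay unique after the conversion.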
