
Changes to transcript files #187

@mark-dce

WUSTL needs to make the following updates to any transcript files:

  • Add the TEI namespace: xmlns="http://www.tei-c.org/ns/1.0"
  • Remove duplicate xml:id definitions. Most files define the same identifier on both the /TEI node and the /TEI/text node; make the second one unique or remove it.
  • Add the source filename at /TEI/teiHeader/fileDesc/sourceDesc/recordingStmt/recording/media/@url
  • Correct the timecode stamps, e.g. "00:00:00:33" should be "00:00:33:00" for the 33-second mark
  • Review the CCR records. They use a teiCorpus container node instead of TEI, so either they need to be changed to be consistent with the LEW and AWP collections, or a separate importer will need to be developed (additional scope).
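
The timecode correction above could be scripted along these lines. This is only a sketch under a stated assumption: that the broken stamps are four two-digit fields in which the seconds value landed in the last field while the seconds field was left at "00". The issue gives a single example, so the real pattern in the files may differ. The function name fix_timecode is hypothetical.

```python
import re

# Four two-digit, colon-separated fields, e.g. "00:00:00:33".
_TIMECODE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):(\d{2})$")


def fix_timecode(stamp):
    """Move a misplaced seconds value from the last field into the third.

    ASSUMPTION (not confirmed by the issue): a stamp is broken when its
    seconds field is "00" and its last field holds a value from 01 to 59.
    Stamps that do not match that pattern are returned unchanged.
    """
    match = _TIMECODE.match(stamp)
    if match is None:
        raise ValueError(f"unexpected timecode format: {stamp!r}")
    hours, minutes, seconds, last = match.groups()
    if seconds == "00" and last != "00" and int(last) < 60:
        return f"{hours}:{minutes}:{last}:00"
    return stamp


if __name__ == "__main__":
    print(fix_timecode("00:00:00:33"))
```

Because the rule is a heuristic, it would be worth spot-checking the rewritten stamps against the recordings before committing any files.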

EXAMPLE FILE EXCERPTS

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="avi14885.00654.023">
   <teiHeader>
      ...
      <sourceDesc>
         <recordingStmt>
            <recording type="video" dur="TIME_process">
               <media type="video/mov" url="fma-2-78436-acc-20140424.mov"/>
               ...
            </recording>
            ...
         </recordingStmt>
         ...
      </sourceDesc>
   </teiHeader>
   <text xml:id="avi14885.00654.023-text">
      ...
   </text>
</TEI>
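
A quick way to check whether a delivered file meets the first three requirements might look like the sketch below. It uses only Python's standard xml.etree.ElementTree module. The function name check_transcript and the wording of the messages are hypothetical, and it is not the importer's own validation.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree expands the predeclared "xml:" prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
MEDIA_STEPS = ["teiHeader", "fileDesc", "sourceDesc",
               "recordingStmt", "recording", "media"]


def check_transcript(xml_source):
    """Return a list of problems found in one transcript (str or bytes)."""
    problems = []
    root = ET.fromstring(xml_source)

    # 1. The root must be <TEI> in the TEI namespace (a teiCorpus root fails).
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append(f"root is {root.tag!r}, expected TEI in {TEI_NS}")

    # 2. Every xml:id value must be unique within the file.
    ids = Counter(el.get(XML_ID) for el in root.iter()
                  if el.get(XML_ID) is not None)
    for value, count in ids.items():
        if count > 1:
            problems.append(f"xml:id {value!r} is defined {count} times")

    # 3. media/@url must carry the source filename. Reuse whatever
    #    namespace the root has so a missing namespace is reported once.
    ns = root.tag[1:].split("}")[0] if root.tag.startswith("{") else ""
    path = "/".join(f"{{{ns}}}{step}" if ns else step for step in MEDIA_STEPS)
    media = root.find(path)
    if media is None or not media.get("url"):
        problems.append("recording/media/@url is missing or empty")

    return problems
```

An empty list means the file passed these three checks. Timecode stamps and the CCR container question are outside what this sketch covers, apart from a teiCorpus root being reported by the first check.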
