Teuken.yaml: 4 additions & 4 deletions
@@ -33,7 +33,7 @@ org:
 datasources_basemodel:
 class: partial
 link: https://arxiv.org/pdf/2410.08800
-notes: Dataset described as deriving from the CommonCrawl, but no filtered dataset provided. Either a filtered dataset or a fully reproducible and persistent data pipeline would be warranted.
+notes: Dataset described as deriving from the CommonCrawl, but no filtered dataset provided. Either a filtered dataset or a fully reproducible and persistent data pipeline would be preferred here.
-notes: SBATCH script with training code available at fork of Megatron-LM. However, no easily visible and easily navigable repository containing the code used to train the model is available.
+notes: SBATCH script with training code available at fork of Megatron-LM. However, no easily visible and easily navigable repository containing the code used to train the model is available. Making the repository more easily visible would alleviate this.
 
 # documentation:
 code:
 class: closed
 link:
-notes: README of Megatron-LM repo containing training code is unchanged from base repo. More elaborate documentation would be warranted.
+notes: "README of containing training code is unchanged from base repo. More elaborate documentation would be warranted. A good example for a good documentation style would be the repository for the OLMo model: https://github.com/allenai/OLMo"
 
 hardware_architecture:
 class: open
@@ -84,7 +84,7 @@ modelcard:
 datasheet:
 class: closed
 link:
-notes: No datasheet containing a detailed description of data collection and curation is found attached to a persistent version of the model data, as would be preferred here.
+notes: No datasheet containing a detailed description of data collection and curation is found attached to a persistent version of the model data, as would be preferred here. A persistent version of the filtered data with attached the information in the data preprint at https://arxiv.org/abs/2410.08800 would be sufficient here.