+'citation': 'Srivastava, N., Salakhutdinov, R. R., & Hinton, G. E. (2013).'
+'Modeling documents with deep Boltzmann machines.'
+'arXiv preprint arXiv:1309.6865.'
+}
+}

 model_hyperparameters= {
 'LDA': {
@@ -451,3 +462,111 @@

 l1_ratio (double, optional) – The regularization mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an elementwise L2 penalty (aka Frobenius Norm). For l1_ratio = 1 it is an elementwise L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
 """
+
+
+
+RSM_hyperparameters_info="""
+num_topics (int, default=50) – Number of latent topics (hidden units) in the Replicated Softmax Model.
+
+epochs (int, default=5) – Number of training epochs (full passes over the dataset).
+
+btsz (int, default=100) – Mini-batch size used during training.
+
+lr (float, default=0.01) – Learning rate for parameter updates.
+
+momentum (float, default=0.1) – Momentum coefficient (used when train_optimizer='momentum').
+
+K (int, default=1) – Number of Gibbs sampling steps for k-step Contrastive Divergence (K-CD).
+
+softstart (float, default=0.001) – Scale for random initialization of weights (weights ~ N(0,1)*softstart).
+
+decay (float, default=0) – Regularization coefficient. If >0, interaction penalty is applied (L1 or L2).
+
+penalty_L1 (bool, default=False) – If True use L1 regularization; otherwise L2 is used.
+
+penalty_local (bool, default=False) – If True apply penalty locally per-weight; otherwise apply a global penalty.
+
+epochs_per_monitor (int, default=1) – Frequency (in epochs) to record monitoring metrics when monitor=True.
+
+monitor (bool, default=False) – If True compute and store log-likelihood / perplexity during training.
+
+persistent_cd (bool, default=False) – If True use persistent contrastive divergence (PCD) chains.
+
+mean_field_cd (bool, default=True) – If True use mean-field contrastive divergence (mfcd) updates.
+
+increase_cd (bool, default=False) – If True use gradual k-step CD (k increases across epochs).
+
+increase_speed (float, default=0) – Controls speed of gradual increase of k when increase_cd is True.
+
+cd_type (str, default='mfcd') – Type of contrastive-divergence algorithm. Common values: 'mfcd' (mean-field CD), 'kcd' (k-step CD), 'pcd' or 'persistent' (persistent CD), 'gradkcd' (gradual kcd).
+
+train_optimizer (str, default='sgd') – Optimizer used for parameter updates. Options include: 'sgd', 'momentum', 'adagrad', 'rmsprop', 'adam', 'full' (full-batch), 'minibatch'.
+
+logdtm (bool, default=False) – If True apply log(1+count) transform to the document-term matrix before training.
+
+val_dtm (array or None, default=None) – Validation document-term matrix (used when training with partitions).
+
+random_state (int or None, default=None) – Seed for numpy RNG for reproducible runs.
+
+rms_decay (float, default=0.9) – RMSProp moving-average decay (used if train_optimizer='rmsprop').
+
+adam_decay1 (float, default=0.9) – Adam first-moment decay (beta1).
+
+adam_decay2 (float, default=0.999) – Adam second-moment decay (beta2).
+"""
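Review note: to make the roles of `K`, `lr`, `softstart`, `btsz`, and `random_state` concrete, here is a minimal NumPy sketch of one k-step contrastive-divergence update for a Replicated Softmax Model. It is not the code added in this commit. The helper name `rsm_cd_step`, the bias names `a` and `b`, and the use of `np.random.default_rng` for seeding are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    x = x - x.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

def rsm_cd_step(dtm_batch, W, a, b, rng, K=1, lr=0.01):
    """One k-step CD update on a mini-batch of word counts (hypothetical helper).

    dtm_batch: (n_docs, n_words) counts; W: (n_words, num_topics);
    a: hidden biases (num_topics,); b: visible biases (n_words,).
    """
    D = dtm_batch.sum(axis=1, keepdims=True)      # document lengths, shape (n_docs, 1)
    h_pos = sigmoid(dtm_batch @ W + D * a)        # positive phase: p(h=1 | data)
    v, h_prob = dtm_batch, h_pos
    for _ in range(K):                            # K Gibbs sampling steps
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        p_v = softmax(h @ W.T + b)                # word distribution per document
        # Resample each document with its original length.
        v = np.vstack([rng.multinomial(int(d), p)
                       for d, p in zip(D[:, 0], p_v)]).astype(float)
        h_prob = sigmoid(v @ W + D * a)           # negative phase: p(h=1 | sample)
    n = dtm_batch.shape[0]
    W += lr * (dtm_batch.T @ h_pos - v.T @ h_prob) / n
    a += lr * (D * (h_pos - h_prob)).mean(axis=0) # hidden bias is scaled by document length
    b += lr * (dtm_batch - v).mean(axis=0)
    return W, a, b

# Usage on toy data; the values mirror the documented defaults where one exists.
rng = np.random.default_rng(0)                    # plays the role of random_state
n_words, num_topics, softstart = 20, 5, 0.001
W = rng.standard_normal((n_words, num_topics)) * softstart  # weights ~ N(0,1)*softstart
a = np.zeros(num_topics)
b = np.zeros(n_words)
dtm = rng.integers(0, 4, size=(8, n_words)).astype(float)   # one mini-batch of 8 documents
W, a, b = rsm_cd_step(dtm, W, a, b, rng, K=1, lr=0.01)
```

Because every resampled document keeps its original length, the visible-bias updates sum to zero across the vocabulary, which is a quick sanity check on any implementation of this step.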
+
+
+
+oRSM_hyperparameters_info="""
+num_topics (int, default=50) – Number of latent topics (hidden units) in the Over Replicated Softmax Model.
+
+epochs (int, default=5) – Number of training epochs (full passes over the dataset).
+
+pretrain_epochs (int, default=1) – Number of initial epochs that run the pretraining (mean-field) phase.
+
+btsz (int, default=100) – Mini-batch size used during training.
+
+M (int, default=30) – Number of hidden multinomial units in the additional replicated softmax layer (over-replication factor).
+
+lr (float, default=0.01) – Learning rate for parameter updates.
+
+momentum (float, default=0.1) – Momentum coefficient (used when train_optimizer='momentum').
+
+softstart (float, default=0.001) – Scale for random initialization of weights (weights ~ N(0,1)*softstart).
+
+decay (float, default=0) – Regularization coefficient. If >0, interaction penalty is applied (L1 or L2).
+
+penalty_L1 (bool, default=False) – If True use L1 regularization; otherwise L2 is used.
+
+penalty_local (bool, default=False) – If True apply penalty locally per-weight; otherwise apply a global penalty.
+
+cd_type (str, default='mfcd') – Type of contrastive-divergence algorithm (common values: 'mfcd' mean-field CD, 'kcd' k-step CD, 'pcd' persistent CD).
+
+train_optimizer (str, default='sgd') – Optimizer used for parameter updates. Options include: 'sgd', 'momentum', 'adagrad', 'rmsprop', 'adam'.
+
+rms_decay (float, default=0.9) – RMSProp moving-average decay (used if train_optimizer='rmsprop').
+
+adam_decay1 (float, default=0.9) – Adam first-moment decay (beta1).
+
+adam_decay2 (float, default=0.999) – Adam second-moment decay (beta2).
+
+logdtm (bool, default=False) – If True apply log(1+count) transform to the document-term matrix before training.
+
+val_dtm (array or None, default=None) – Validation document-term matrix (used when training with partitions).
+
+epochs_per_monitor (int, default=1) – Frequency (in epochs) to record monitoring metrics when monitor=True.
+
+monitor (bool, default=False) – If True compute and store monitoring metrics (e.g., perplexity) during training.
+
+random_state (int or None, default=None) – Seed for numpy RNG for reproducible runs.
+
+use_partitions (bool, default=True) – Whether the dataset partitions (train/test) are used (class attribute).
+
+softstart (float, default=0.001) – Initial scale for weight initialization.
+
+epsilon (float, default=0.01) – Convergence threshold used by mean-field updates (internal training parameter).
+
+Notes:
+- The model accepts a document-term matrix (dtm) as training input; many hyperparameters (e.g., M, btsz, lr, train_optimizer) influence training dynamics and convergence.
+- Pretraining uses a simplified k-CD step (pretrain_epochs) before full training.
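Review note: the docstrings list `momentum`, `rms_decay`, `adam_decay1`, and `adam_decay2` without showing where each enters the update. The sketch below uses the textbook forms of these optimizers for a gradient-ascent step and may differ from what this commit implements. The function name `apply_update`, the `state` dictionary, and the stabilizer `eps` are assumptions for illustration; `eps` is unrelated to the documented `epsilon` convergence threshold.

```python
import numpy as np

def apply_update(param, grad, state, train_optimizer="sgd", lr=0.01,
                 momentum=0.1, rms_decay=0.9, adam_decay1=0.9,
                 adam_decay2=0.999, eps=1e-8):
    """Return param after one ascent step; `state` carries running statistics.

    Hypothetical helper: `eps` only guards against division by zero.
    """
    if train_optimizer == "sgd":
        step = lr * grad
    elif train_optimizer == "momentum":
        # Velocity is a decaying sum of past steps.
        state["v"] = momentum * state.get("v", 0.0) + lr * grad
        step = state["v"]
    elif train_optimizer == "adagrad":
        # Accumulate all squared gradients seen so far.
        state["g2"] = state.get("g2", 0.0) + grad ** 2
        step = lr * grad / (np.sqrt(state["g2"]) + eps)
    elif train_optimizer == "rmsprop":
        # Exponential moving average of squared gradients.
        state["g2"] = rms_decay * state.get("g2", 0.0) + (1 - rms_decay) * grad ** 2
        step = lr * grad / (np.sqrt(state["g2"]) + eps)
    elif train_optimizer == "adam":
        t = state["t"] = state.get("t", 0) + 1
        state["m"] = adam_decay1 * state.get("m", 0.0) + (1 - adam_decay1) * grad
        state["g2"] = adam_decay2 * state.get("g2", 0.0) + (1 - adam_decay2) * grad ** 2
        m_hat = state["m"] / (1 - adam_decay1 ** t)    # bias-corrected first moment
        v_hat = state["g2"] / (1 - adam_decay2 ** t)   # bias-corrected second moment
        step = lr * m_hat / (np.sqrt(v_hat) + eps)
    else:
        raise ValueError(f"unsupported train_optimizer: {train_optimizer!r}")
    return param + step

# Usage: one step from zero with a toy gradient.
grad = np.array([2.0, -4.0])
w_sgd = apply_update(np.zeros(2), grad, {}, train_optimizer="sgd")
w_adam = apply_update(np.zeros(2), grad, {}, train_optimizer="adam")
```

On the first Adam step the bias correction cancels the decay factors, so each coordinate moves by roughly `lr` in the direction of its gradient sign, regardless of the gradient's magnitude.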