```python
if not all(isinstance(label, int) for label in train_labels):
    # Convert string labels to integers
    label_to_id = {label: idx for idx, label in enumerate(unique_labels)}
    id_to_label = {idx: label for label, idx in label_to_id.items()}
    train_labels = [label_to_id[label] for label in train_labels]
    val_labels = [label_to_id[label] for label in val_labels]
    test_labels = [label_to_id[label] for label in test_labels]

    # Save label mappings for later use
    label_mappings = {"label_to_id": label_to_id, "id_to_label": id_to_label}
    with open(os.path.join(args.output_dir, "label_mappings.json"), "w") as f:
        json.dump(label_mappings, f)
    logging.info("Converted string labels to integers and saved mappings.")

# Create model
model = XnapASI(
    echo_model_name=echo_model_name,
    fusion_method=fusion_method,
    num_labels=num_labels
)
model.dropout.p = dropout  # Set dropout rate
tokenizer = model.echo_tokenizer

# Create datasets and dataloaders
train_dataset = XnapASIDataset(
    train_texts, train_labels, tokenizer,
    augment=args.augment, aug_prob=args.aug_prob
)
val_dataset = XnapASIDataset(val_texts, val_labels, tokenizer)
test_dataset = XnapASIDataset(test_texts, test_labels, tokenizer)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_dataloader = DataLoader(val_dataset, batch_size=batch_size)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size)

# Train the model
logging.info("Starting training...")
trained_model = train_xnap_asi(
    model,
    train_dataloader,
    val_dataloader,
    epochs=args.epochs,
    lr=learning_rate,
    device=args.device,
    output_dir=args.output_dir,
    use_amp=args.use_amp,
    save_interval=args.save_interval,
    max_grad_norm=args.max_grad_norm
)

# Evaluate the model on the test set
logging.info("Evaluating on test set...")
test_results = evaluate_xnap_asi(
    trained_model,
    test_dataloader,
    device=args.device,
    output_dir=args.output_dir
)

# Save final results
results_path = os.path.join(args.output_dir, "final_results.json")
with open(results_path, "w") as f:
    json.dump(test_results, f)
logging.info(f"Saved final results to {results_path}")
logging.info("Training and evaluation complete.")
```
Explanation of the `main` Function
- Label Handling:
  - If labels are not integers (e.g., strings), they are converted to integers using a mapping (`label_to_id` and `id_to_label`).
  - The mappings are saved to a JSON file for later use (e.g., during inference); a loading sketch follows this list.
- Model Initialization:
  - The `XnapASI` model is initialized with the specified `echo_model_name`, `fusion_method`, and `num_labels`.
  - The dropout rate is set based on the hyperparameters.
- Dataset and DataLoader Creation:
  - The `XnapASIDataset` class is used to create datasets for training, validation, and testing.
  - Data augmentation is applied to the training set if enabled.
- Training:
  - The `train_xnap_asi` function is called to train the model. It handles the training loop, validation, and early stopping.
- Evaluation:
  - The trained model is evaluated on the test set using the `evaluate_xnap_asi` function.
  - Test results (accuracy, precision, recall, F1 score) are saved to a JSON file.
- Logging and Output:
  - All outputs (models, logs, results) are saved to the specified `output_dir`.
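Since `label_mappings.json` is written with `json.dump`, its integer ids come back as string keys when reloaded, so an inference script needs to convert them. A minimal loading sketch, assuming the script's default output directory (the `predicted_id` value is purely illustrative):

```python
import json
import os

output_dir = "./output"  # illustrative; matches the script's default --output_dir

# Restore the mappings written by the training script.
with open(os.path.join(output_dir, "label_mappings.json")) as f:
    mappings = json.load(f)

# json.dump stores integer keys as strings, so convert them back to ints.
id_to_label = {int(idx): label for idx, label in mappings["id_to_label"].items()}

predicted_id = 2  # e.g., the argmax over the model's output logits
print(id_to_label[predicted_id])
```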
Argument Parsing
To make the script user-friendly, we can add argument parsing using `argparse`:
```python
def parse_args():
    parser = argparse.ArgumentParser(description="Train and evaluate the XnapASI model.")

    # Data arguments
    parser.add_argument("--data_path", type=str, required=True, help="Path to the dataset file.")
    parser.add_argument("--test_size", type=float, default=0.2, help="Proportion of data for testing.")
    parser.add_argument("--val_size", type=float, default=0.1, help="Proportion of data for validation.")

    # Model arguments
    parser.add_argument("--echo_model_name", type=str, default="gpt2", help="Name of the transformer model to use.")
    parser.add_argument("--fusion_method", type=str, default="weighted",
                        choices=["weighted", "attention", "gating", "concat"],
                        help="Method to fuse representations.")
    parser.add_argument("--dropout", type=float, default=0.1, help="Dropout rate.")

    # Training arguments
    parser.add_argument("--epochs", type=int, default=10, help="Number of training epochs.")
    parser.add_argument("--batch_size", type=int, default=16, help="Batch size for training and evaluation.")
    parser.add_argument("--learning_rate", type=float, default=2e-5, help="Learning rate.")
    parser.add_argument("--max_grad_norm", type=float, default=1.0, help="Maximum gradient norm for clipping.")
    parser.add_argument("--use_amp", action="store_true", help="Use automatic mixed precision (AMP) for training.")
    parser.add_argument("--augment", action="store_true", help="Enable data augmentation.")
    parser.add_argument("--aug_prob", type=float, default=0.5, help="Probability of applying augmentation.")

    # Optimization arguments
    parser.add_argument("--optimize", action="store_true", help="Enable hyperparameter optimization with Optuna.")
    parser.add_argument("--optuna_trials", type=int, default=20, help="Number of Optuna trials for hyperparameter optimization.")

    # Output arguments
    parser.add_argument("--output_dir", type=str, default="./output", help="Directory to save outputs.")
    parser.add_argument("--log_dir", type=str, default="./logs", help="Directory to save logs.")

    # Miscellaneous arguments
    parser.add_argument("--seed", type=int, default=42, help="Random seed for reproducibility.")
    parser.add_argument("--device", type=str, default=None, help="Device to use (e.g., 'cuda', 'cpu').")
    parser.add_argument("--save_interval", type=int, default=1, help="Save model every n epochs.")

    return parser.parse_args()
```
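For completeness, here is one plausible way to wire `parse_args` into an entry point, assuming the code shown earlier lives in a `main(args)` function. The `set_seed` helper and the device fallback are assumptions, since that glue is not shown in this section:

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    # Hypothetical helper: seed every RNG the training pipeline touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


if __name__ == "__main__":
    args = parse_args()
    set_seed(args.seed)
    # Resolve the device here, since --device defaults to None.
    if args.device is None:
        args.device = "cuda" if torch.cuda.is_available() else "cpu"
    main(args)  # assumed entry point wrapping the training/evaluation code above
```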
Running the Script
To run the script, use the following command:
```bash
python xnap_asi.py --data_path /path/to/data.csv --output_dir ./output --optimize
```
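The `--optimize` flag enables an Optuna study, whose wiring is not shown in this section. Below is a hedged sketch of what the objective could look like, reusing `train_xnap_asi` and `evaluate_xnap_asi` from above; the search ranges, the omitted keyword arguments, and the `"f1"` result key are all assumptions:

```python
import logging

import optuna


def objective(trial: optuna.Trial) -> float:
    # Illustrative search space; the actual ranges are not shown in this section.
    lr = trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    fusion_method = trial.suggest_categorical(
        "fusion_method", ["weighted", "attention", "gating", "concat"]
    )

    model = XnapASI(
        echo_model_name=args.echo_model_name,
        fusion_method=fusion_method,
        num_labels=num_labels,
    )
    model.dropout.p = dropout

    trained = train_xnap_asi(
        model, train_dataloader, val_dataloader,
        epochs=args.epochs, lr=lr, device=args.device,
        output_dir=args.output_dir,
    )
    # Assumes the evaluation dict exposes an "f1" key (F1 is among the saved metrics).
    results = evaluate_xnap_asi(
        trained, val_dataloader, device=args.device, output_dir=args.output_dir
    )
    return results["f1"]


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=args.optuna_trials)
logging.info(f"Best hyperparameters: {study.best_params}")
```

Note that the objective scores trials on the validation set, so the test set stays untouched until the final training run.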
Conclusion
This implementation provides a complete pipeline for training, evaluating, and optimizing the `XnapASI` model. It includes features like data augmentation, hyperparameter optimization, and mixed precision training. Let me know if you need further assistance!