NeuroName: Domain-Specific AI Architecture for Creative Name Generation
What is NeuroName?
NeuroName is a purpose-built neural architecture for generating creative, novel names for brands, YouTube channels, social media handles, products, and more. Unlike generic LLMs that produce obvious word combinations, NeuroName creates genuinely new words that:
- Sound natural and pronounceable
- Evoke intended meanings without being literal
- Are controllable (length, style, language feel, energy)
- Are truly novel: not existing words or obvious compounds
Why Current LLMs Fail at Creative Naming
| Problem | Why It Happens | NeuroName Solution |
|---|---|---|
| Too generic | LLMs predict probable tokens from training distribution | Character-level VAE generates outside known distributions |
| Obvious combinations | Token-level = existing word chunks | Char-level latent space enables smooth morphological blending |
| No sound awareness | No phonotactic model | Dedicated Phonotactic Discriminator scores pronounceability |
| Can't be truly novel | Constrained to recombine training tokens | VAE latent interpolation creates genuinely new sequences |
| No fine control | Prompt engineering is imprecise | Energy-based composable attribute control in latent space |
| RLHF kills creativity | Safety alignment → conservative outputs | No RLHF; creativity is the objective function |
Architecture Overview
Input: semantic_hints + control_params (length, style, language_feel, energy)
               │
               ▼
┌─────────────────────────────┐
│      Semantic Encoder       │ ← Transformer encodes meaning hints
│     (attention-pooled)      │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│      Conditional Prior      │ ← P(z|semantics, controls) - Gaussian
│   Network (μ, σ learned)    │
└──────────────┬──────────────┘
               │
               ▼ z ~ N(μ, σ²)
┌─────────────────────────────┐
│     Latent Space + EBM      │ ← Energy-based attribute composition
│    (ODE-guided sampling)    │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│      Character Decoder      │ ← Transformer generates char-by-char
│    (cross-attends to z)     │
└──────────────┬──────────────┘
               │
               ▼
┌─────────────────────────────┐
│    Phonotactic Validator    │ ← CNN+Transformer scores sound quality
└──────────────┬──────────────┘
               │
               ▼
Generated Name: "Velocix"
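The sampling step in the diagram, z ~ N(μ, σ²), is usually implemented with the reparameterization trick. The sketch below is only an illustration in plain Python: the function name `sample_latent`, the 4-dimensional latent, and the example values of `mu` and `log_var` are assumptions made here, not code from this repository (which states a latent dimension of 128 and builds on PyTorch).

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)  # sigma = exp(log_var / 2)
        for m, lv in zip(mu, log_var)
    ]


rng = random.Random(0)                # fixed seed so the draw is repeatable
mu = [0.5, -1.0, 0.0, 2.0]            # hypothetical prior means
log_var = [-2.0, -2.0, -2.0, -2.0]    # hypothetical log-variances
z = sample_latent(mu, log_var, rng)
```

Parameterizing the variance as `log_var` keeps σ positive without any clamping, which is the usual reason VAE implementations predict log σ² rather than σ.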
Key Innovations
1. Character-Level VAE (not token-level)
Operates at individual characters, enabling creation of genuinely novel sequences impossible with subword tokenizers.
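To make the character-level input concrete, here is a minimal sketch of how a name could be mapped to integer ids. It assumes one possible reading of the figures quoted under Technical Details (44 characters, 32 positions): 26 lowercase letters, 10 digits, and 8 special symbols. The specific special symbols, the `encode`/`decode` names, and the choice to count the start and end markers inside the 32 positions are all assumptions for illustration.

```python
import string

# Assumption: the README does not list the 44 characters, so this split is illustrative.
PAD, BOS, EOS, UNK = "<pad>", "<bos>", "<eos>", "<unk>"
SPECIALS = [PAD, BOS, EOS, UNK, "-", "_", ".", " "]
VOCAB = SPECIALS + list(string.ascii_lowercase) + list(string.digits)
CHAR_TO_ID = {ch: i for i, ch in enumerate(VOCAB)}
MAX_LEN = 32


def encode(name, max_len=MAX_LEN):
    """Lowercase the name, map each character to an id, and pad to max_len."""
    body = [CHAR_TO_ID.get(ch, CHAR_TO_ID[UNK]) for ch in name.lower()]
    ids = [CHAR_TO_ID[BOS]] + body[: max_len - 2] + [CHAR_TO_ID[EOS]]
    return ids + [CHAR_TO_ID[PAD]] * (max_len - len(ids))


def decode(ids):
    """Turn ids back into text, stopping at the end marker."""
    out = []
    for i in ids:
        tok = VOCAB[i]
        if tok == EOS:
            break
        if tok not in (PAD, BOS):
            out.append(tok)
    return "".join(out)
```

Because every position is a single character, a decoder over this vocabulary can emit letter sequences that no subword tokenizer has a piece for.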
2. Phonotactic Discriminator
Learned model of sound combinations (bigrams, trigrams, syllable structure) based on the Bouba-Kiki Effect and cross-linguistic phonotactics. Ensures outputs are pronounceable and pleasant-sounding.
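As a rough illustration of what scoring sound combinations means, the sketch below fits a character-bigram model with add-one smoothing and returns the mean log-probability per transition. It is a deliberately tiny stand-in, not the CNN+Transformer discriminator described here: it works on spelling rather than phonemes, and the function names and the training list (brand names taken from this README) are assumptions.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"


def _chars(name):
    """Wrap the letters of a name in start/end boundary markers."""
    return [START] + [c for c in name.lower() if c in ALPHABET] + [END]


def train_bigrams(names):
    """Count character bigrams, including the boundary transitions."""
    pair_counts, prev_counts = Counter(), Counter()
    for name in names:
        chars = _chars(name)
        for prev, nxt in zip(chars, chars[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return pair_counts, prev_counts


def pronounceability(name, model):
    """Mean log-probability per transition; higher means more familiar letter patterns."""
    pair_counts, prev_counts = model
    outcomes = len(ALPHABET) + 1  # the next symbol is any letter or END
    chars = _chars(name)
    log_probs = [
        math.log((pair_counts[(p, n)] + 1) / (prev_counts[p] + outcomes))
        for p, n in zip(chars, chars[1:])
    ]
    return sum(log_probs) / len(log_probs)


BRANDS = ["kodak", "tiktok", "swift", "visa", "amazon", "nintendo", "google",
          "pixel", "bose", "dell", "paypal", "tesla", "zara", "spotify",
          "lexus", "rolex", "wii", "apple", "volvo"]
model = train_bigrams(BRANDS)
natural = pronounceability("volex", model)
awkward = pronounceability("xqzvk", model)
```

With this training list, `natural` comes out higher than `awkward`, because "volex" reuses transitions seen in the brand names while "xqzvk" relies entirely on smoothed, unseen ones.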
3. Morphological Composition Module
Explicit linguistic word-formation operations as differentiable modules:
- Blending: "breakfast + lunch → brunch" style merging
- Affixation: Meaningful prefix/suffix attachment
- Vowel Harmony: Sound shifting for cohesion
- Clipping + Extension: Shortening with style
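The blending operation can be pictured with a plain string rule: keep the consonant onset of the first word and attach the second word from its first vowel onward. This is a symbolic sketch only. The module described above is differentiable and operates inside the model, whereas the `blend` helper here is a hypothetical, rule-based approximation that handles only the simple onset-plus-rime case.

```python
VOWELS = set("aeiou")


def first_vowel_index(word):
    """Index of the first vowel letter, or len(word) if there is none."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return i
    return len(word)


def blend(first, second):
    """Onset of `first` (letters before its first vowel) + `second` from its first vowel."""
    first, second = first.lower(), second.lower()
    onset = first[: first_vowel_index(first)]
    rime = second[first_vowel_index(second):]
    return onset + rime


print(blend("breakfast", "lunch"))  # brunch
print(blend("smoke", "fog"))        # smog
```

The rule returns a short or odd result when the first word starts with a vowel (its onset is empty), which is one reason a learned blending module is more flexible than a fixed cut point.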
4. Energy-Based Composable Control
Multiple attributes (style, length, language feel) composed via energy functions in latent space. Mathematically principled, not prompt hacking.
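A toy version of this composition, under stated assumptions: each attribute contributes a quadratic energy around a hypothetical centroid in latent space, the total energy is their weighted sum, and the latent is moved by fixed-step gradient descent (an Euler discretization of dz/dt = -∇E). The centroids, weights, step size, and function names are all illustrative, and the actual model may use learned energy functions and a different ODE solver.

```python
def total_energy(z, attributes):
    """Weighted sum of quadratic energies 0.5 * ||z - c||^2, one per attribute."""
    return sum(
        w * 0.5 * sum((zi - ci) ** 2 for zi, ci in zip(z, c))
        for c, w in attributes
    )


def energy_gradient(z, attributes):
    """Gradient of the total energy with respect to z."""
    return [sum(w * (z[k] - c[k]) for c, w in attributes) for k in range(len(z))]


def guide(z, attributes, step=0.1, n_steps=200):
    """Follow the negative energy gradient with fixed-size Euler steps."""
    for _ in range(n_steps):
        grad = energy_gradient(z, attributes)
        z = [zi - step * gi for zi, gi in zip(z, grad)]
    return z


attributes = [
    ([1.0, 0.0], 1.0),  # hypothetical "modern" style centroid, weight 1
    ([0.0, 1.0], 1.0),  # hypothetical "latin" language-feel centroid, weight 1
]
z_start = [3.0, -2.0]
z_end = guide(z_start, attributes)
```

With quadratic energies the minimum is the weighted mean of the centroids, so `z_end` settles near `[0.5, 0.5]`. Adding or reweighting an attribute changes the target without retraining anything, which is the appeal of composing controls as energies.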
5. Sound Symbolism Integration
Phoneme-meaning associations baked into the architecture:
- Plosives (b, d, k, t): Power, strength → "Kodak", "TikTok"
- Fricatives (f, s, sh, v): Speed, elegance → "Swift", "Visa"
- Nasals (m, n): Warmth, comfort → "Amazon", "Nintendo"
- Front vowels (i, e): Precision, tech → "Pixel", "Wii"
Installation
pip install torch numpy pyyaml tqdm
git clone https://huggingface.co/asdf98/neuroname
cd neuroname
pip install -e .
Quick Start
from neuroname import NeuroNameGenerator

# Initialize generator
generator = NeuroNameGenerator()

# Generate brand names with semantic hints
names = generator.generate(
    semantic_hints=["speed", "technology", "future"],
    style="modern",           # modern/classic/playful/techy/organic/elegant/bold/minimal
    language_feel="latin",    # english/latin/greek/japanese/nordic/spanish/french/abstract
    energy="energetic",       # calm/neutral/energetic
    length_range=(5, 8),
    num_names=10,
    temperature=0.8,
)
print(names)
# ['Velocix', 'Tervon', 'Nexura', 'Fluxen', 'Zyphos', ...]

# Generate YouTube channel names
names = generator.generate(
    semantic_hints=["gaming", "adventure", "epic"],
    style="playful",
    language_feel="english",
    energy="energetic",
    length_range=(6, 12),
    num_names=10,
)

# Generate social media handles
names = generator.generate(
    semantic_hints=["art", "minimal", "aesthetic"],
    style="elegant",
    language_feel="french",
    energy="calm",
    length_range=(4, 8),
    num_names=10,
)
Training
# Train from scratch
python train.py --config configs/default.yaml
# Train with custom data
python train.py --data_path your_names.txt --epochs 100
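The README does not specify the format of the custom data file. Assuming `your_names.txt` holds one name per line, a small pre-cleaning pass like the one below can remove blanks, duplicates, and names longer than the stated 32-character maximum before training. The `clean_names` helper is hypothetical and not part of this repository.

```python
def clean_names(lines, max_len=32):
    """Lowercase, strip, de-duplicate, and drop empty or over-long names (order preserved)."""
    seen, cleaned = set(), []
    for line in lines:
        name = line.strip().lower()
        if not name or len(name) > max_len or name in seen:
            continue
        seen.add(name)
        cleaned.append(name)
    return cleaned


raw = ["Kodak\n", "kodak\n", "\n", "Visa\n", "x" * 40 + "\n"]
print(clean_names(raw))  # ['kodak', 'visa']
```

In practice the `lines` argument can be an open file object, since iterating over a text file yields its lines.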
Repository Structure
neuroname/
├── README.md              # This file
├── pyproject.toml         # Package configuration
├── neuroname/
│   ├── __init__.py        # Package exports
│   ├── model.py           # Core architecture (VAE + all components)
│   ├── generator.py       # High-level generation interface
│   ├── phonotactics.py    # Phonotactic scoring & sound symbolism
│   ├── morphology.py      # Morphological composition operations
│   ├── latent_ops.py      # Energy-based latent space control
│   ├── data.py            # Dataset & data loading utilities
│   └── config.py          # Configuration management
├── train.py               # Training script
├── configs/
│   └── default.yaml       # Default training configuration
└── notebooks/
    └── demo.ipynb         # Interactive demonstration
Sound Symbolism Research Basis
Our architecture is grounded in linguistic research on sound-meaning associations:
| Phoneme Type | Associations | Example Brands |
|---|---|---|
| Voiced plosives (b, g, d) | Strong, bold, grounded | Bose, Google, Dell |
| Voiceless plosives (p, t, k) | Sharp, precise, clean | PayPal, Tesla, Kodak |
| Fricatives (f, v, s, z) | Fast, flowing, futuristic | Visa, Zara, Spotify |
| Nasals (m, n) | Warm, nurturing, smooth | Amazon, Nintendo |
| Liquids (l, r) | Fluid, dynamic, premium | Lexus, Rolex |
| High vowels (i, ee) | Small, quick, technical | Pixel, Wii |
| Low vowels (a, o) | Big, open, powerful | Apple, Volvo |
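The table can be turned into a simple lookup that counts which sound classes appear in a candidate name. The sketch below is an approximation under clear assumptions: it matches letters rather than phonemes, ignores digraphs such as "ee" and "sh", and the `sound_profile` name and class labels are made up for this example.

```python
from collections import Counter

# Letter sets copied from the table above; spelling is used as a proxy for sound.
SOUND_CLASSES = {
    "voiced plosive": "bdg",
    "voiceless plosive": "ptk",
    "fricative": "fvsz",
    "nasal": "mn",
    "liquid": "lr",
    "high vowel": "i",
    "low vowel": "ao",
}
LETTER_TO_CLASS = {
    ch: label for label, letters in SOUND_CLASSES.items() for ch in letters
}


def sound_profile(name):
    """Count how many letters of the name fall into each sound class."""
    return Counter(
        LETTER_TO_CLASS[ch] for ch in name.lower() if ch in LETTER_TO_CLASS
    )


profile = sound_profile("Kodak")
# Counter with 2 voiceless plosives (k, k), 2 low vowels (o, a), 1 voiced plosive (d)
```

A profile like this could be compared against a target such as "mostly fricatives and high vowels" when ranking candidates, though a phoneme-level analysis would be needed for anything beyond a rough filter.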
Technical Details
- Model Size: ~15M parameters (intentionally small: domain-specific, not general)
- Latent Dimension: 128
- Character Vocabulary: 44 chars (lowercase + digits + special)
- Max Name Length: 32 characters
- Training: ELBO loss + phonotactic reward + attribute classification
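One way to read the training line above as a single objective is sketched below. The README does not give the exact form, so every symbol here is an assumption: an approximate posterior q_φ(z | x) over names x, the conditional prior p_θ(z | c) with c standing for semantic hints and controls, a phonotactic reward R_phon on the decoded name, an attribute classification loss on labels a, and weights β, λ_phon, λ_attr.

```latex
\mathcal{L}(\theta, \phi) =
  \underbrace{
    -\,\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z, c)\big]
    + \beta\, D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p_\theta(z \mid c)\big)
  }_{\text{negative ELBO}}
  \;-\; \lambda_{\text{phon}}\, R_{\text{phon}}(\hat{x})
  \;+\; \lambda_{\text{attr}}\, \mathcal{L}_{\text{cls}}(z, a)
```

Minimizing the first two terms maximizes the ELBO. The reward enters with a negative sign because it is maximized, and the classification term encourages the latent to carry the attribute information that the control parameters later steer.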
License
MIT License - see LICENSE file for details.
Acknowledgments
Architecture inspired by:
- LatentOps - Composable text controls in latent space
- LlaMaVAE - VAE with LLM decoder
- Bouba-Kiki Effect - Sound symbolism research
- Controllable Text Generation Survey - CTG methods taxonomy
Generated by ML Intern
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "asdf98/neuroname"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.