Migration Guide

This guide helps you migrate from direct lynguine calls to server mode for faster repeated access.

Who Should Use Server Mode?

✅ Server mode is beneficial if you:

  • Call lynguine multiple times (e.g., in loops, batch processing)

  • Experience slow startup times (~2 seconds per call)

  • Use subprocesses to call lynguine (like lamd does)

  • Work in single-user, local development environments

❌ Server mode is NOT needed if you:

  • Already have a long-running process (e.g., Jupyter kernel)

  • Only call lynguine once

  • Need multi-user authentication (use Phase 4 instead)

Migration Steps

Step 1: Understand the Performance Gain

Before (Direct Mode):

# Every call incurs ~2s startup (pandas, numpy, lynguine loading)
for i in range(10):
    df = some_lynguine_call()  # 2s each = 20s total

After (Server Mode):

# First call: ~2s startup
# Subsequent calls: ~10ms each
for i in range(10):
    df = client.read_data(...)  # ~10ms each after the first; ~90ms for the remaining 9 calls

Speedup: 156.3x for repeated calls (1.532s → 9.8ms)

Step 2: Install Dependencies

Server mode requires the requests library:

pip install requests
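
To confirm the dependency is importable before migrating, a quick check:

python -c "import requests; print(requests.__version__)"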

Step 3: Choose Your Migration Strategy

Option A: Manual Server Start (Simple, Full Control)

Best for: Development, testing, understanding the system

Old code (direct mode):

from lynguine.access import io

df = io.read('config.yml', directory='.')

New code (server mode - manual):

from lynguine.client import ServerClient

# 1. Start server in separate terminal:
#    python -m lynguine.server

# 2. Use client instead of direct calls
client = ServerClient('http://127.0.0.1:8765')
df = client.read_data(interface_file='config.yml', directory='.')
client.close()
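
Option B: Auto-Start (Hands-Off)

Best for: scripts that should manage the server lifecycle themselves. This is the pattern used throughout Step 4: passing auto_start=True makes the client launch the server on first use, so there is no separate terminal to manage.

New code (server mode - auto-start):

from lynguine.client import ServerClient

# No manual server start needed: the client launches it on first use
client = ServerClient(auto_start=True, idle_timeout=300)
df = client.read_data(interface_file='config.yml', directory='.')
client.close()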

Step 4: Update Your Code Patterns

Pattern 1: Single Data Read

Before:

from lynguine.access import io

df = io.read('config.yml', directory='.')

After:

from lynguine.client import ServerClient

client = ServerClient(auto_start=True, idle_timeout=300)
df = client.read_data(interface_file='config.yml', directory='.')
client.close()

Pattern 2: Multiple Reads (The Big Win!)

Before (20s for 10 reads):

from lynguine.access import io

results = []
for file in config_files:
    df = io.read(file)  # 2s each
    results.append(df)

After (2s + 100ms for 10 reads):

from lynguine.client import ServerClient

client = ServerClient(auto_start=True, idle_timeout=300)
results = []
for file in config_files:
    df = client.read_data(interface_file=file)  # 10ms each after first
    results.append(df)
client.close()

Pattern 3: Subprocess Calls (Like lamd)

Before:

# Each subprocess incurs full startup cost
python -c "from lynguine.access import io; io.read(...)"  # 2s
python -c "from lynguine.access import io; io.read(...)"  # 2s
python -c "from lynguine.access import io; io.read(...)"  # 2s

After:

# First call starts server, subsequent calls are fast
python -c "from lynguine.client import ServerClient; ServerClient(auto_start=True).read_data(...)"  # 2s first time
python -c "from lynguine.client import ServerClient; ServerClient().read_data(...)"  # 10ms
python -c "from lynguine.client import ServerClient; ServerClient().read_data(...)"  # 10ms

Pattern 4: Context Manager (Clean Teardown)

from lynguine.client import ServerClient

with ServerClient(auto_start=True, idle_timeout=300) as client:
    df1 = client.read_data(interface_file='config1.yml')
    df2 = client.read_data(interface_file='config2.yml')
    df3 = client.read_data(interface_file='config3.yml')
# Client closes automatically, server remains running

Step 5: Configure for Your Use Case

For Development (Persistent Server)

client = ServerClient(
    auto_start=True,
    idle_timeout=0  # Never auto-shutdown (manual stop only)
)

For CI/CD (Auto-Cleanup)

client = ServerClient(
    auto_start=True,
    idle_timeout=60,  # 1-minute timeout for fast cleanup
    max_retries=2     # Fewer retries for faster failure
)

For Production (Robust)

client = ServerClient(
    auto_start=True,
    idle_timeout=300,    # 5-minute timeout
    max_retries=3,       # Standard retry count
    retry_delay=1.0,     # 1s, 2s, 4s retry delays
    timeout=60.0         # 60s request timeout
)

Step 6: Test Your Migration

  1. Verify functionality: Results should be identical to direct mode (see the comparison sketch after this list)

  2. Measure performance: Use benchmarks to confirm speedup

  3. Test failure scenarios: Kill server, check auto-restart works

  4. Monitor resources: Check server memory usage over time
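
For step 1, a minimal comparison sketch, assuming the same config file is readable in both modes:

# Functionality check: direct mode and server mode should agree
from lynguine.access import io
from lynguine.client import ServerClient

df_direct = io.read('config.yml', directory='.')

client = ServerClient(auto_start=True)
df_server = client.read_data(interface_file='config.yml', directory='.')
client.close()

assert df_direct.equals(df_server), "server mode results differ from direct mode"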

# Benchmark comparison
import time
from lynguine.client import ServerClient

client = ServerClient(auto_start=True)

# Warmup (first call starts server)
_ = client.read_data(data_source={'type': 'fake', 'nrows': 10, 'cols': ['name']})

# Measure subsequent calls
start = time.time()
for _ in range(10):
    _ = client.read_data(data_source={'type': 'fake', 'nrows': 10, 'cols': ['name']})
elapsed = time.time() - start

print(f"10 calls: {elapsed:.3f}s ({elapsed/10*1000:.1f}ms per call)")
# Expected: ~100ms total, ~10ms per call

client.close()
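
For step 3, stop the server externally (e.g., Ctrl+C in the terminal running python -m lynguine.server) and issue another request. A minimal sketch, assuming auto_start re-launches the server when it is unreachable:

from lynguine.client import ServerClient

# With the server killed externally, a fresh auto-start client should
# bring it back up instead of raising a connection error
client = ServerClient(auto_start=True, max_retries=3)
df = client.read_data(data_source={'type': 'fake', 'nrows': 10, 'cols': ['name']})
print("auto-restart OK:", len(df) == 10)
client.close()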

Application-Specific Integration

lamd Integration

Before:

# lamd calls lynguine via subprocess repeatedly
subprocess.run(['python', '-c', 'from lynguine.access import io; ...'])

After:

# First call auto-starts server, subsequent calls are fast
subprocess.run(['python', '-c', '''
from lynguine.client import ServerClient
client = ServerClient(auto_start=True, idle_timeout=300)
client.read_data(...)
'''])

referia Integration

Warning

Server mode is NOT recommended for referia because:

  • referia already runs in a long-running Jupyter kernel

  • No repeated subprocess calls = no startup overhead

  • Server mode adds unnecessary complexity

Continue using direct mode in referia.

Rolling Back

If you need to roll back to direct mode:

# Before (server mode):
from lynguine.client import ServerClient
client = ServerClient()
df = client.read_data(interface_file='config.yml')

# After (direct mode):
from lynguine.access import io
df = io.read('config.yml')

No data migration needed: both modes work identically!

Performance Expectations

Scenario       Direct Mode   Server Mode (First)   Server Mode (Subsequent)   Speedup
Single call    1.947s        1.947s                –                          1x
10 calls       19.47s        1.947s + 98ms         98ms                       198x
100 calls      194.7s        1.947s + 980ms        980ms                      ~67x

Note: the 10-call speedup (198x) compares direct mode against a warm server (98ms); the ~67x figure for 100 calls includes the one-time ~2s startup (194.7s vs ~2.9s).
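
These figures follow from the per-call costs; a quick arithmetic check:

# Reproduce the table's speedups from the measured per-call costs
direct_per_call = 1.947   # seconds per direct-mode call
warm_per_call = 0.0098    # seconds per warm-server call
startup = 1.947           # one-time server startup

print(10 * direct_per_call / (10 * warm_per_call))              # ≈198x (warm-server basis)
print(100 * direct_per_call / (startup + 100 * warm_per_call))  # ≈67x (includes startup)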

Next Steps

  1. ✅ Migrate your code using auto-start

  2. ✅ Measure your actual performance gains

  3. ✅ Configure idle timeout for your use case

  4. ✅ Add retry logic for production robustness

  5. 📚 Read Troubleshooting for common issues

  6. 📊 Monitor server resources in production

See Also