Understanding Looker's Architecture
LookML, PDTs, and the Database Backend
Looker translates LookML into SQL, pushing computation down to the database layer (a minimal sketch of this translation follows the list below). While this design is powerful, it means performance depends heavily on:
- Efficient LookML modeling (joins, explores, aggregates)
- Proper use of Persistent Derived Tables (PDTs)
- Data warehouse tuning and concurrency handling
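To make that translation concrete, here is a minimal sketch; the view, field, and table names are illustrative assumptions, not taken from this article. Selecting the status dimension and the order_count measure from this explore would produce a simple grouped query against the underlying table.

view: orders {
  sql_table_name: analytics.orders ;;

  dimension: status {
    type: string
    sql: ${TABLE}.status ;;
  }

  measure: order_count {
    type: count
  }
}

explore: orders {}

# Selecting status and order_count generates SQL roughly like:
#   SELECT orders.status, COUNT(*) AS order_count
#   FROM analytics.orders AS orders
#   GROUP BY 1

Every inefficiency in the model is therefore amplified at the warehouse, which is why modeling and warehouse tuning have to be treated together.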
Looker Caching and Query Routing
Looker caches query results for performance, but caches are scoped by user, permissions, and explore. A misconfigured caching strategy can result in redundant SQL execution under load.
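A common way to control this is a datagroup-based caching policy. The sketch below is illustrative (the datagroup name, trigger query, and ETL table are assumptions, not from this document): cached results stay valid until new data actually lands, instead of expiring on a fixed timer.

# Illustrative caching policy: the cache stays warm until the ETL marker changes.
datagroup: orders_etl {
  sql_trigger: SELECT MAX(completed_at) FROM etl_log ;;  # assumed ETL audit table
  max_cache_age: "12 hours"
}

explore: orders {
  persist_with: orders_etl  # queries on this explore share the datagroup's cache policy
}

Because the cache key still includes the requesting user's access scope and the exact fields and filters of the query, a shared datagroup reduces redundant SQL but does not bypass permission scoping.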
Common Troubleshooting Scenarios
1. Dashboard Timeouts
Complex dashboards often time out because of:
- Unfiltered cross joins due to LookML model errors
- Excessive row returns without limits or filters
- Slow subqueries or nested PDTs
2. Broken PDTs or Build Failures
PDTs can fail to build when:
- The underlying SQL logic changes upstream
- They exceed memory limits on the warehouse
- Build schedules conflict across environments
3. Query Explosion from Fanouts
Joining fact tables to each other, or misdeclaring one-to-many relationships, can cause multiplicative row duplication (fanout), overwhelming both Looker and the data warehouse.
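The SQL below is an illustrative sketch of the problem (table and column names are assumptions): once orders is joined to its line items, each order row repeats once per item, so an unprotected SUM over the order total is inflated.

-- Illustrative fanout: each order appears once per matching order_item,
-- so SUM(orders.order_total) counts the same order several times.
SELECT
  orders.status,
  SUM(orders.order_total) AS inflated_total
FROM orders
LEFT JOIN order_items
  ON order_items.order_id = orders.id
GROUP BY orders.status;

Looker guards against this with symmetric aggregates, but only when the joined views declare primary keys and the join's relationship is declared correctly.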
Root Cause Diagnostics
1. Enable Development Logging
Use the Query panel and SQL Runner to inspect generated SQL:
Explore > Query > SQL > View SQL
Check for Cartesian joins, unnecessary subselects, or missing filters.
2. Monitor PDT Build Logs
Navigate to Admin > Persistent Derived Tables > Log to inspect error messages. Watch for syntax errors, timeouts, or quota breaches.
3. Use System Activity Dashboards
Looker's internal explores like 'i__looker' offer insights on performance:
Explore i__looker > History > Query Runtime, Errors, Users
4. Database-Side Profiling
Run EXPLAIN plans or use warehouse tools (like BigQuery's Execution Details, Snowflake's Query History) to identify slow stages in SQL execution.
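For example, you can paste the SQL copied from SQL Runner into an EXPLAIN statement. The exact syntax varies by warehouse; the form below works in Snowflake, Redshift, and PostgreSQL, while BigQuery surfaces the same information in its Execution Details panel. The query itself is illustrative.

-- Inspect the plan for a query copied out of SQL Runner (illustrative query).
EXPLAIN
SELECT users.state, COUNT(*) AS order_count
FROM orders
LEFT JOIN users
  ON orders.user_id = users.id
GROUP BY users.state;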
Step-by-Step Fixes
1. Refactor LookML Joins
Always define primary keys on joined views. Avoid relying on symmetric aggregates across large datasets without scoped filters or aggregate awareness.
explore: orders {
  join: users {
    type: left_outer
    sql_on: ${orders.user_id} = ${users.id} ;;
    relationship: many_to_one
  }
}
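For the join above to behave well under fanout, the joined view also needs an explicit primary key. A minimal sketch of the users view (names assumed to match the example) might look like this:

view: users {
  sql_table_name: analytics.users ;;  # assumed table name

  dimension: id {
    primary_key: yes   # lets Looker apply symmetric aggregates and avoid double counting
    type: number
    sql: ${TABLE}.id ;;
  }
}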
2. Optimize PDT Strategy
- Use triggered builds based on data freshness rather than fixed schedules (see the sketch after this list)
- Persist heavy logic outside Looker (materialized views)
- Limit row size and remove unused columns
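As a sketch of a freshness-triggered PDT, the derived table below rebuilds only when its datagroup's trigger fires. The view, columns, and datagroup name are illustrative assumptions (the datagroup follows the caching example earlier).

view: customer_ltv {
  derived_table: {
    sql:
      SELECT
        user_id,
        SUM(order_total) AS lifetime_value
      FROM analytics.orders
      GROUP BY user_id ;;
    datagroup_trigger: orders_etl  # rebuild only when the datagroup detects new data
  }

  dimension: user_id {
    type: number
    primary_key: yes
    sql: ${TABLE}.user_id ;;
  }

  dimension: lifetime_value {
    type: number
    sql: ${TABLE}.lifetime_value ;;
  }
}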
3. Implement Aggregate Awareness
Use the 'aggregate_table' parameter to reduce scan costs and improve dashboard response:
explore: sales {
  aggregate_table: monthly_summary {
    query: {
      dimensions: [month, region]
      measures: [total_revenue]
    }
    materialization: {
      sql_trigger_value: SELECT CURRENT_DATE ;;  # rebuild daily; adjust to your SQL dialect
    }
  }
}
4. Throttle and Paginate Large Queries
Set row limits in dashboards and force filters on large explores. Avoid SELECT * patterns and ensure indexed fields are used in filters.
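One way to enforce this in the model rather than per dashboard is a conditional filter on the explore. The sketch below is illustrative and assumes an events view with created_date and id fields: users must filter on a date range unless they look up a specific record.

explore: events {
  conditionally_filter: {
    filters: [events.created_date: "7 days"]  # default filter applied when no "unless" field is filtered
    unless: [events.id]
  }
}

Dashboard-level row limits and filters still matter, but a model-level guard prevents an unfiltered explore from ever reaching the warehouse.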
5. Align with Warehouse Best Practices
Partition large tables, enforce clustering, and tune concurrency settings. Poor performance is often more about the warehouse than Looker itself.
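As an illustrative BigQuery example (dataset, table, and column names are assumptions), partitioning and clustering a large fact table lets Looker-generated queries that filter on the partition column scan far less data.

-- Illustrative BigQuery DDL: partition by event date, cluster by a common filter column.
CREATE TABLE analytics.orders_partitioned
PARTITION BY DATE(created_at)
CLUSTER BY user_id AS
SELECT order_id, user_id, created_at, order_total
FROM analytics.orders;

In LookML, make sure commonly used date filters map onto the partition column so that pruning actually applies to the generated SQL.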
Best Practices for Enterprise Deployments
- Use Looker's Content Validator to catch broken references before promotion
- Standardize development through Git integration and code reviews
- Monitor query volume and user concurrency via 'i__looker'
- Establish guidelines for LookML join strategies
- Periodically audit PDT build schedules and dependencies
Conclusion
Looker is only as performant as its models and warehouse strategy allow. Deep performance issues often stem from inefficient LookML joins, careless PDT management, or unoptimized SQL generation. By systematically analyzing query behavior, aligning LookML with data architecture, and enforcing modeling best practices, teams can prevent performance bottlenecks and ensure a scalable, trusted analytics platform.
FAQs
1. Why do some dashboards time out while others are fast?
Typically due to inefficient joins, unfiltered explores, or nested subqueries. Use SQL Runner to compare generated queries.
2. How do I debug PDT build errors?
Check the PDT log in the Admin panel. Common issues include SQL errors, resource limits, or warehouse permission changes.
3. Can I prevent fanout issues in Looker?
Yes. Define proper primary keys and use 'relationship' settings like many_to_one. Avoid joining two fact tables unless pre-aggregated.
4. How does caching work in Looker?
Looker caches queries based on user, explore, and filters. Changes in permissions or dimensions can invalidate the cache, triggering full re-runs.
5. What should I do when warehouse queries are slow?
Run EXPLAIN plans or warehouse-native query analyzers. Optimize table structures, enforce partitions, and limit query complexity in LookML.