MWSUG 2015 Paper Presentations
Paper presentations are the heart of a SAS users group meeting. MWSUG 2015 will feature nearly 100 paper presentations organized into 10 academic sections covering a variety of topics and experience levels.
Note: Content and schedule are subject to change. Last updated 17-Oct-2015.
- Advanced Analytics
- BI / Customer Intelligence
- Beyond the Basics
- Career Development
- Data Visualization and Graphics
- Pharmaceutical Applications
- Posters
- Rapid Fire
- SAS 101
- Tools of the Trade
Advanced Analytics
BI / Customer Intelligence
Beyond the Basics
Career Development
Paper No. | Author(s) | Paper Title (click for abstract) |
CD-01 | Kirk Paul Lafler & Charlie Shipp | What's Hot, What's Not - Skills for SAS® Professionals |
CD-02 | Tim O'Brien | SAS® Software as an Essential Tool in Statistical Consulting and Research |
CD-03 | Helen Fowler | Preparing for Future Careers in Big Data, Marketing Analytics and Data Warehousing |
CD-04 | John Xu | Career Development Panel Discussion |
Data Visualization and Graphics
Paper No. | Author(s) | Paper Title (click for abstract) |
DV-02 | Ting Sa | A Macro to Easily Generate an interactive Google Map Report |
DV-03 | Yu Daniel Wang | Sales Force Alignment Visualization with SAS® |
DV-04 | Xingrong Zhang | Visualize the Geography of your Business Insights using SAS MAP tools |
DV-05 | Roger Muller | Getting Productive Fast in SAS ODS Graphics -- a "Simple Look-See" Approach |
DV-06 | Jack Sawilowsky | Intuitive Demonstration of Statistics through Data Visualization of Pseudo-Randomly Generated Numbers in R and SAS |
DV-07 | Jeffrey Meyers & Jayawant Mandrekar | Cutpoint Determination Methods in Survival Analysis using SAS®: Updated %FINDCUT macro |
Pharmaceutical Applications
Paper No. | Author(s) | Paper Title (click for abstract) |
PH-01 | Seungyoung Hwang | Transitions in Depressive Symptoms After 10 Years of Follow-up Using PROC LTA in SAS® and Mplus |
PH-02 | Kalaivani Raghunathan | Automated Project Management of SAS Tasks - Excel Dashboard without Using any Program |
PH-03 | Sara Burns et al. | How to Build a Hierarchical Mixed Model in SAS |
PH-06 | Robin High | Plotting LSMEANS and Differences in Generalized Linear Models with GTL |
PH-07 | Deanna Schreiber-Gregory | Utilizing Propensity Score Analyses to Adjust for Selection Bias: A Study of Adolescent Mental Illness and Substance Use |
PH-08 | Troy Hughes | Ushering SAS Emergency Medicine into the 21st Century: Toward Exception Handling Objectives, Actions, Outcomes, and Comms |
Posters
Paper No. | Author(s) | Paper Title (click for abstract) |
PO-01 | Ting Sa | Keep the Formats When Exporting to Excel |
Rapid Fire
SAS 101
Tools of the Trade
Paper No. | Author(s) | Paper Title (click for abstract) |
TT-01 | Kirk Paul Lafler | Dynamic Dashboards Using Base-SAS® Software |
TT-02 | Mike Tangedal | You Can't Spell 'Assume' without S.A.S. |
TT-03 | Emily Hawkins & Lal Puthenveedu Rajanpillai | Leveraging Hadoop from the Comfort of SAS |
TT-04 | Kaushal Chaudhary & Deanna Schreiber-Gregory | Multiple Imputation for Arbitrary Missing Data: SAS vs R |
TT-05 | Kent Phelps & Ronda Phelps | SAS® Enterprise Guide® Base SAS® Program Nodes ~ Automating Your SAS World With a Dynamic FILENAME Statement, Dynamic Code, and the CALL EXECUTE Command; Your Newest BFF (Best Friends Forever) in SAS |
TT-06 | Mengyu Liu | Using SAS Programs to Conduct Discriminate Analysis |
TT-09 | Helen Fowler & Tho Nguyen | A Demonstration of SAS Analytics in Teradata |
Abstracts
Advanced Analytics
AA-01 : Can you STOP the guesswork in your marketing budget Allocation?? Marketing Mixed Modeling using SAS® can help!!
Delali Agbenyegah, Alliance Data Systems
Monday, 8:00 AM - 8:50 AM, Location: Merchants
Marketing is unavoidable in every business, yet each year the marketing budget is limited, and prudent fund allocation is required to optimize marketing investment. In many businesses, the marketing fund is allocated based on the marketing manager's experience, departmental budget allocation rules, and sometimes the 'gut feelings' of business leaders. Those traditional ways of budget allocation yield suboptimal results and in many cases lead to money wasted on irrelevant marketing efforts. Market Mixed Models can be used to understand the effects of marketing activities and identify the key marketing efforts that drive the most sales among a group of competing marketing activities. The results can be used in marketing budget allocation and take out the guesswork that typically goes into it. In this paper, we illustrate how to develop and implement Market Mixed Modeling using SAS procedures from a practical perspective. Real-life challenges of Market Mixed Model development and execution are discussed, and several recommendations are provided to overcome some of those challenges.
AA-02 : Insurance Predictive Analytics
Mei Najim, Sedgwick
Monday, 9:00 AM - 9:50 AM, Location: Merchants
Predictive modeling differs across the insurance, banking, pharmaceutical, and genetics industries because of differences in available data sources, regulation, and business objectives, even though it rests on the same statistical foundations. Organizations with large, rich data face some unique and unavoidable challenges. This paper introduces a common modeling process for big data, which includes data acquisition, data preparation, variable creation, variable selection, model building/fitting, model validation, and model testing in SAS Enterprise Guide and SAS Enterprise Miner. Some successful models in the insurance industry will be introduced. Base SAS, SAS Enterprise Guide, and SAS Enterprise Miner are the main tools, and a logistic regression model is used as the main example throughout the paper.
AA-03 : Fitting and Evaluating Logistic Regression Models
Bruce Lund, Marketing Associates
Tuesday, 1:00 PM - 1:20 PM, Location: Merchants
Logistic regression models are commonly used in direct marketing and consumer finance applications having large data sets. This paper discusses the fitting and evaluation of logistic regression models in this context. The first topic is the screening and transformation of predictor variables. The second topic is a comparison of two methods of fitting multiple candidate models. The first of these methods is the familiar best subsets approach. Then best subsets is compared to a new method which generates a combination of models produced by backward and forward selection plus the models considered by backward and forward. This second method uses HPLOGISTIC with selection of models by SBC (Schwarz Bayes). The final topic is a discussion of model evaluation statistics to measure predictive accuracy and goodness-of-fit in support of the choice of a final model. The paper uses Base SAS® and SAS/STAT.
AA-04 : Advanced Techniques for Fitting Mixed Models Using SAS/STAT® Software
Kathleen Kiernan
Monday, 11:00 AM - 11:50 AM, Location: Merchants
Fitting mixed models to complicated data, such as data that include multiple sources of variation, can be a daunting task. SAS/STAT® software offers several procedures and approaches for fitting mixed models. This paper provides guidance on how to overcome obstacles that commonly occur when you fit mixed models using the MIXED and GLIMMIX procedures. Examples are used to showcase procedure options and programming techniques that can help you overcome difficult data and modeling situations.
AA-05 : Need Help Finding Something? Why Not Ask Your (Artificial) Community! Applying Collaborative Filtering Recommendation Systems to Increase Engagement and Credit Sales
Benjamin Elbert, Alliance Data
Monday, 1:00 PM - 1:50 PM, Location: Merchants
People have likely been recommending everything from where to eat to whom to meet since the dawn of man, but in the last twenty years we have seen the rise of automated, data driven, recommendation engines (also known as recommender systems, expert systems, etc.). One popular approach to recommending products to potential shoppers is collaborative filtering (CF), which relies on weighting the quantity, propensity, or rating of other shoppers by the similarity of the potential shopper's transactions to those of the community of shoppers (traditionally at an individual level). This traditional implementation of a CF-System has the drawback that new or early tenure shoppers may not have enough informative transactional data to make valuable recommendations. In this paper we derive a model that uses information available to banks and creditors as a means to make valuable recommendations to new or low tenure cardholders in addition to multi-transaction persons. We also demonstrate how to implement a model that leverages large amounts of data in SAS, and provide tips for testing the model to determine optimal parameters. Finally, we discuss how to extend our CF-System to areas in addition to the recommendation challenge.
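The similarity-weighting idea described above can be illustrated with a small sketch. This is not the authors' model (which runs in SAS and draws on bank and creditor data); it is a minimal user-based collaborative filter in Python, assuming cosine similarity as the similarity measure. The shopper names, items, and the `recommend` helper are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, community, top_n=3):
    """Rank items the target has not bought by similarity-weighted ratings."""
    scores = {}
    for other in community.values():
        sim = cosine_similarity(target, other)
        if sim <= 0:
            continue  # shoppers with nothing in common contribute nothing
        for item, rating in other.items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical community of shoppers and their item ratings.
community = {
    "shopper_a": {"shoes": 5, "socks": 3, "hat": 1},
    "shopper_b": {"shoes": 4, "belt": 5},
    "shopper_c": {"lamp": 5, "rug": 4},
}
new_shopper = {"shoes": 5}
print(recommend(new_shopper, community))  # ['belt', 'socks', 'hat']
```

A shopper with a single transaction, as here, has very little to match on, which is the cold-start drawback the abstract raises for new or early-tenure shoppers.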
AA-06 : Base SAS Sentiment Analysis Using Catchphrases
Mike Tangedal, Capella University
Monday, 3:00 PM - 3:50 PM, Location: Merchants
The social media revolution has increased the volume of opinion-based free-form text available for processing. The first consideration when tackling this text for sentiment analysis is deciding what constitutes a valid entry. Base SAS is quite adept at parsing text into individual words with help from Perl regular expression functions. Matching these individual words and word stems to credible lookup tables that note parts of speech and happiness ratings is now simpler because of the emphasis on text mining in the analytics community. Noting the percentage of word types, such as adjectives, within text entries provides insight, as does noting the percentage of positive or negative words based on the happiness rating. Sarcasm is a difficult challenge that can be mitigated somewhat through notation of affection and negation words. The final key to this method of sentiment analysis is building emotionally charged multi-word catchphrases. Once the target audience of the sentiment analysis agrees upon the positive or negative tone of these phrases, assigning sentiment is a matter of counting the catchphrases matched in a text entry and then applying an agreed-upon ratio to confirm negativity or positivity.
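The final catchphrase-matching step can be sketched as follows. This is an illustrative Python version, not the Base SAS code from the paper; the phrase lists, the `ratio` default of 2.0, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical catchphrase lists that the target audience has agreed on.
POSITIVE_PHRASES = ["love it", "works great", "highly recommend"]
NEGATIVE_PHRASES = ["waste of money", "never again", "does not work"]


def normalize(text):
    """Lowercase the text and reduce it to space-separated words."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def count_phrases(text, phrases):
    """Count whole-phrase matches of each catchphrase in the text."""
    norm = normalize(text)
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", norm))
        for phrase in phrases
    )


def classify(text, ratio=2.0):
    """Label an entry by comparing positive and negative catchphrase counts.

    An entry is positive (or negative) only when that side has at least
    `ratio` times as many matches as the other; otherwise it is neutral.
    """
    pos = count_phrases(text, POSITIVE_PHRASES)
    neg = count_phrases(text, NEGATIVE_PHRASES)
    if pos > 0 and pos >= ratio * neg:
        return "positive"
    if neg > 0 and neg >= ratio * pos:
        return "negative"
    return "neutral"


print(classify("I love it. Works great!"))         # positive
print(classify("Waste of money, never again."))    # negative
print(classify("I love it but it does not work"))  # neutral
```

The ratio threshold keeps mixed entries neutral rather than forcing a label, which is one simple way to act on the "agreed upon ratio" the abstract mentions.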
AA-07 : Slice and Dice your customers easily by using SAS® Clustering Procedures
Yanping Shen, Alliance Data
Tuesday, 1:30 PM - 2:20 PM, Location: Merchants
With a brighter economy, the customer base of the retail industry has been expanding at a fast pace in recent years. This growth in customer base, coupled with the increasing availability of many data attributes at the customer level and the move towards personalized marketing efforts, poses a new challenge to customer relationship management. In this paper, we illustrate how to innovatively utilize the FASTCLUS and CLUSTER procedures in SAS® to create customer segmentations using large data sets with variables having different units of measurement. This paper concludes by suggesting several business applications that can be developed for each customer segment as well as opportunities to further slice and dice different customer segments to enhance personalized marketing efforts.
AA-08 : Get the highest bangs for your buck using Incremental Response Models in SAS® Enterprise Miner™
Delali Agbenyegah, Alliance Data Systems
Tuesday, 11:00 AM - 11:50 AM, Location: Merchants
Traditional marketing predictive models target customers who are likely to shop, make more trips, or spend more. While this approach generally yields better marketing performance than random selection, it can sometimes waste money on customers who will shop regardless of marketing offers and on 'do not disturb' customers who would rather stop shopping if you 'disturb' them with marketing offers. Net lift models are used to identify 'persuadable' customers who have a higher likelihood of responding to marketing campaigns and help marketers maximize their return on marketing investments. This paper simplifies the basic concept of Net Lift modeling using real-life examples and shows how this can easily be accomplished using SAS Enterprise Miner. The paper concludes with real-life challenges of Net Lift modeling and suggests ways to handle some of those challenges, as well as situations where Net Lift models work best.
AA-09 : Using SAS for the Longitudinal Analysis of Difference Scores
Brandy Sinco, University of Michigan
Edith Kieffer, University of Michigan School of Social Work
Michael Spencer, University of Michigan School of Social Work
Gloria Palmisano, CHASS Center
Gretchen Piatt, University of Michigan Medical School
Michele Heisler, University of Michigan Medical School
Tuesday, 8:00 AM - 8:50 AM, Location: Merchants
Background: When longitudinal data has little missing baseline data, analysis of difference scores is one method of normalizing the error terms, even if the original outcome variable is non-normal. Adjusting for the baseline value as a covariate enables estimation of difference scores, with adjustment for the starting value. Objective and Methods: Derive the linear mixed model (LMM) for difference scores, which will include terms for time, treatment group, interaction between time and treatment, and baseline value. Demonstrate how to use the SAS data step to prepare a dataset for longitudinal analysis of difference scores. Present a SAS macro that uses Proc Mixed for analysis of difference scores, with adjustment for the baseline values of treatment groups. Derive the formulas for contrasts between change scores between treatment groups, adjusted for baseline. Show how to convert the contrast equations to SAS Estimate statements. Further, explain how the between-group contrasts can be adjusted for multiple comparisons. The example data will be from a diabetes study with three treatment groups with time points at baseline, 6 months, 12 months, and 18 months. Results: Examples will be presented that show the trajectory of an outcome over time between treatment groups, in table and graphic format. These will include the treatment group improving significantly in comparison to the control group, and the treatment group staying the same while the control group worsened over time. Conclusion: Outcome analysis based on an LMM on difference scores with baseline adjustment is an effective analysis technique for longitudinal data.
AA-10 : Cross-Cultural Comparison of the School Factors Affecting Students' Achievement in Mathematical Literacy: Based on the Multilevel Analysis of PISA 2012
Yage Guo, University of Nebraska-Lincoln
Tuesday, 9:00 AM - 9:50 AM, Location: Merchants
Student achievement is a global concern as reflected in recent large-scale standardized assessments. This study compared PISA 2012 student mathematical literacy scores across four countries/regions with varying levels of student performance: Shanghai-China, the United States, Finland and Japan. Sixty-five countries participated in PISA 2012, which measured 15-year-old children's mathematical achievement. The study explored the relationship of principals' perceived levels of leadership, school policy, and educational resources with student attainment of mathematical literacy. School variables were treated as covariates when each effect of principal leadership was interpreted. All variables were included in a multilevel model and analyzed simultaneously. The means and standard deviations of outcome variables and the explanatory and control variables for the model of the study were calculated by including sampling weights and plausible values for mathematical literacy scores. SAS PROC MIXED was used to fit multilevel linear models for the study. The findings indicated that: with students' background controlled, the effect of school educational resources on students' mathematical literacy demonstrated some cultural differences among the four countries. Specifically, class size had a significantly positive effect on students' mathematical literacy in Finland and Japan. There was a negative relationship between student achievement and lack of educational resources. Social, economic, and cultural status showed a positive relationship with mathematical literacy under each of the four different cultural contexts. Results also indicated that students are likely to achieve better if principals perceive that there are no shortages of personnel and equipment.
AA-11 : Shifting Expectations: A practical algorithm for detecting level shifts using robust rolling statistics
Matthew Bates, J.P. Morgan Chase
Monday, 10:30 AM - 10:50 AM, Location: Merchants
In the world of predictive analytics, identifying level shifts in the data is like flying an airplane. The current observed altitude (level) generally contains more relevant information about the near future, and if it is not given proper attention, the flight may end up crashing into a mountain. It is sound practice for a forecaster to use level shift information to determine the ideal size of the training data, so long as seasonality and/or trending cannot be safely assumed. The proposed method is intended for practicing predictive modelers with a basic understanding of level shifts. All demonstrations are executed using SAS 9.2 and require only SAS/STAT (SAS/ETS is not needed).
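The abstract does not spell out the algorithm, so the following is only a sketch of one way robust rolling statistics can flag a level shift, written in Python rather than SAS. It assumes a trailing-window median and median absolute deviation (MAD) as the robust statistics, and it declares a shift only when several consecutive points fall on the same side of the band. The function name, parameters, and defaults are hypothetical.

```python
from statistics import median


def detect_level_shifts(series, window=5, threshold=3.0, persist=3,
                        min_scale=1e-9):
    """Return indices where the series appears to move to a new level.

    Each point is compared with the median of the `window` points before it.
    A point is an outlier if it lies more than `threshold` robust standard
    deviations (1.4826 * MAD) from that median. A shift is recorded at the
    first of `persist` consecutive outliers on the same side, after which
    the reference window restarts inside the new level.
    """
    shifts = []
    regime_start = 0   # first index of the current level
    run_start = None   # first index of the current outlier run
    run_sign = 0
    for t, x in enumerate(series):
        # Freeze the reference window while an outlier run is in progress.
        ref_end = t if run_start is None else run_start
        reference = series[max(regime_start, ref_end - window):ref_end]
        if len(reference) < window:
            continue  # not enough history in the current level yet
        center = median(reference)
        mad = median([abs(v - center) for v in reference])
        band = threshold * max(1.4826 * mad, min_scale)
        deviation = x - center
        sign = (deviation > band) - (deviation < -band)
        if sign == 0:
            run_start, run_sign = None, 0
            continue
        if run_start is None or sign != run_sign:
            run_start, run_sign = t, sign
        if t - run_start + 1 >= persist:
            shifts.append(run_start)
            regime_start = run_start
            run_start, run_sign = None, 0
    return shifts


# Illustrative data: the level moves from about 10 to about 30 at index 8.
data = [10, 12, 9, 11, 10, 13, 8, 11, 30, 31, 29, 32, 30, 28, 31, 30]
print(detect_level_shifts(data))  # [8]
```

Requiring several consecutive outliers keeps an isolated spike from being mistaken for a shift, and the detected index gives a natural starting point for the training window that the abstract discusses.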
AA-12 : Predictive Modeling Using Artificial Neural Networks in SAS® Enterprise Miner
Kechen Zhao, University of Southern California
Monday, 2:00 PM - 2:20 PM, Location: Merchants
A neural network is a set of connected input/output variables in which each connection has a weight associated with it. Neural networks take non-linear functions of linear combinations of input variables. This is a powerful and very general approach for regression and classification, and it has been shown to be the best machine learning method on many problems. Neural networks are especially effective in problems with high-noise ratio and in settings where prediction without interpretation is the goal. They are now among the most popular data mining and pattern learning tools, frequently used by companies such as Google. SAS Enterprise Miner implements tools for modeling and utilizing neural networks. However, the literature on neural network modeling using SAS Enterprise Miner is limited. This paper is a step-by-step introduction to neural network modeling with the Neural Network node in SAS Enterprise Miner 13.2. It also addresses the steps in training a neural network, assessing model fit, and utilizing a trained neural network to classify outcomes.
AA-13 : Two-Stage Attribute Match and Its Applications in Loyalty Marketing
Jie Liao, Alliance Data
Yi Cao, Alliance Data
Tuesday, 10:30 AM - 10:50 AM, Location: Merchants
In loyalty marketing, it is particularly challenging to predict customers' shopping behavior with limited information. We therefore propose a Two-Stage Attribute Match (TSAM) algorithm for finding a look-alike community for these customers and making inferences about their future shopping behavior. In the presentation, we explore a variety of applications for this algorithm. For example, by using this algorithm, we are able to infer purchasing amount levels for customers' 9 merchant category groups simultaneously. Another example, a client's reward program incrementality analysis, is also briefly touched on. To make the algorithm accessible in SAS, a macro %TwoStageAM is developed to facilitate its future usage.
BI / Customer Intelligence
BI-01 : Decision Management with DS2
Helen Fowler, Teradata
Tho Nguyen, Teradata
Tuesday, 2:00 PM - 2:20 PM, Location: Herndon
We all make tactical and strategic decisions every day. With the presence of big data, are we making the right or the best decisions possible as data volume, velocity and variety continue to grow? As businesses become more targeted, personalized and public, it is imperative to make precise data-driven decisions for regulatory compliance and risk management. Come learn how SAS In-Database Decision Management for Teradata can help you make the best decision possible by integrating SAS and Teradata.
BI-02 : Creating Multi-Sheet Microsoft Excel Workbooks with SAS®: The Basics and Beyond Part 2.
Vince Delgobbo, SAS
Monday, 1:00 PM - 1:50 PM, Location: Herndon
This presentation explains how to use Base SAS®9 software to create multi-sheet Excel workbooks. You learn step-by-step techniques for quickly and easily creating attractive multi-sheet Excel workbooks that contain your SAS® output using the ExcelXP Output Delivery System (ODS) tagset. The techniques can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! Creating and delivering your workbooks on-demand and in real time using SAS server technology is discussed. Although the title is similar to previous presentations by this author, this presentation contains new and revised material not previously presented.
BI-03 : In Search of the LOST CARD
Andrew Kuligowski, HSN
Monday, 2:00 PM - 2:20 PM, Location: Herndon
Everyone who's not here, raise your hand. It's an old joke, but it points out the difficulty of identifying persons or things that are not present. The SAS® System has its own version of this chestnut, the SASLOG message indicating that there are one or more gaps in one's input data: NOTE: LOST CARD. This presentation will focus on the creation and use of ad hocs to explore input data, in order to locate the positions where the input data might be incomplete. The goal will be to identify where the missing data should be, so that you can code around the limitations of your data.
BI-04 : Staying Relevant in a Competitive World: Using the SAS® Output Delivery System to Enhance, Customize, and Render Reports
Chevell Parker
Monday, 3:00 PM - 3:50 PM, Location: Herndon
Technology is always changing. To succeed in this ever-evolving landscape, organizations must embrace the change and look for ways to use it to their advantage. Even standard business tasks such as creating reports are affected by the rapid pace of technology. Reports are key to organizations and their customers. Therefore, it is imperative that organizations employ current technology to provide data in customized and meaningful reports across a variety of media. The SAS® Output Delivery System (ODS) gives you that edge by providing tools that enable you to package, present, and deliver report data in more meaningful ways, across the most popular desktop and mobile devices. To begin, the paper illustrates how to modify styles in your reports using the ODS CSS style engine, which incorporates the use of cascading style sheets (CSS) and the ODS document object model (DOM). You also learn how you can use SAS ODS to customize and generate reports in the body of e-mail messages. Then the paper discusses methods for enhancing reports and rendering them in desktop and mobile browsers by using the HTML and HTML5 ODS destinations. To conclude, the paper demonstrates the use of selected SAS ODS destinations and features in practical, real-world applications.
BI-05 : Introduction to ODS Graphics
Chuck Kincaid, Experis Business Intelligence and Analytics
Monday, 4:00 PM - 4:50 PM, Location: Herndon
This presentation teaches the audience how to use ODS Graphics. Now part of Base SAS®, ODS Graphics are a great way to easily create clear graphics that enable any user to tell their story well. SGPLOT and SGPANEL are two of the procedures that can be used to produce powerful graphics that used to require a lot of work. The core of the procedures is explained, as well as some of the many options available. Furthermore, we explore the ways to combine the individual statements to make more complex graphics that tell the story better. Any user of Base SAS on any platform will find great value in the SAS ODS Graphics procedures.
BI-06 : Introduction to SAS® Data Loader: The Power of Data Transformation in Hadoop
Keith Renison, SAS Institute
Tuesday, 8:00 AM - 8:20 AM, Location: Herndon
Organizations are loading data into Hadoop platforms at an extraordinary rate. However, in order to extract value from these platforms, the data must be prepared for analytic exploit. As the volume of data grows, it becomes increasingly more important to reduce data movement, as well as to leverage the computing power of these distributed systems. This paper provides a cursory overview of SAS® Data Loader, a product specifically aimed at these challenges. We cover the underlying mechanisms of how SAS Data Loader works, as well as how it's used to profile, cleanse, transform, and ultimately prepare data for analytics in Hadoop.
BI-07 : Intermediate ODS Graphics
Chuck Kincaid, Experis Business Intelligence and Analytics
Tuesday, 9:00 AM - 9:50 AM, Location: Herndon
This paper will build on the knowledge gained in the Intro to SAS® ODS Graphics. The capabilities in ODS Graphics grow with every release as both new paradigms and smaller tweaks are introduced. After talking with the ODS developers, a selection of the many wonderful capabilities was chosen. This paper will look at that selection of both types of capabilities and provide the reader with more tools for their belt. Visualization of data is an important part of telling the story seen in the data. And while the standards and defaults in ODS Graphics are very well done, sometimes the user has specific nuances for characters in the story or additional plot lines they want to incorporate. Almost any possibility, from drama to comedy to mystery, is available in ODS Graphics if you know how. We will explore tables, annotation and changing attributes, as well as the BLOCK and BUBBLE plots. Any user of Base SAS on any platform will find great value from the SAS ODS Graphics procedures. Some experience with these procedures is assumed, but not required.
BI-08 : Easy Come, Easy Go Interactions between the DATA Step and External Files
Andrew Kuligowski, HSN
Tuesday, 10:30 AM - 11:20 AM, Location: Herndon
Chances are, your raw data was not created within the SAS® System. There is a good likelihood that your data may also need to be packaged and passed along to another non-SAS package. This presentation will provide basic answers to two questions common to new SAS users: "How do I get my data into SAS for analysis?" and "How do I get my data out of SAS?" The focus for this presentation will be on two pairs of DATA step statements: INFILE / INPUT and FILE / PUT. We will discuss syntax and usage, citing various types of files as examples.
BI-09 : Debt Collection Through SAS® Analytics Lens
Karush Jaggi, Oklahoma State University
Thomas Waldschmidt, SquareTwo Financial
Harold Dickerson, SquareTwo Financial
Eric Hayes, SquareTwo Financial
Goutam Chakraborty, Oklahoma State University
Tuesday, 11:30 AM - 11:50 AM, Location: Herndon
Debt Collection! The two words can trigger multiple images in one's mind - mostly harsh. However, let's try to think positive for a moment. In 2013, over $55 billion was past due in the United States. What if all of it were left as is, and the fate of credit issuers were in the hands of goodwill payments made by defaulters? Well, not the most sustainable model, to say the least. In this situation, debt collection comes in as a tool that is employed at multiple levels of recovery to keep the credit flowing. Ranging from in-house to third-party to individual collection efforts, the industry is huge. In the recent past, with financial markets recovering and banks selling fewer charged-off accounts at higher prices, collections has increasingly become a game of efficient operations backed by solid analytics. This paper takes you into the back alleys of all the data in there and gives an overview of some ways modeling can be used to impact collection strategy. SAS® tools like Enterprise Miner and Enterprise Guide are utilized for both data manipulation and modeling. Decision tree results are given more focus to understand what factors make the most impact. Anyone with an inclination for analytical decision making is expected to benefit from this paper. Along the way, this paper also gives an idea of how analytics teams today are slowly trying to get 'buy-in' from other stakeholders in the company, which surprisingly is one of the most challenging aspects of our job.
BI-10 : SAS® Enterprise Guide® System Design
Jennifer First-Kluge, Systems Seminar Consultants, Inc.
Tuesday, 1:00 PM - 1:50 PM, Location: Herndon
A good system should embody the following characteristics: It is planned, maintainable, flexible, simple, accurate, restartable, reliable, reusable, automated, documented, efficient, modular, and validated. This is true of any system, but how to implement this in SAS Enterprise Guide is a unique endeavor. We will provide a brief overview of these characteristics and then dive deeper into how an Enterprise Guide user should approach developing both ad hoc and production systems.
BI-11 : The Joinless Join ~ The Impossible Dream Come True; Expanding the Power of SAS® Enterprise Guide® in a New Way
Kent Phelps, The SASketeers
Ronda Phelps, The SASketeers
Tuesday, 8:30 AM - 8:50 AM, Location: Herndon
SAS Enterprise Guide can easily combine data from tables or data sets by using a Graphical User Interface (GUI) PROC SQL Join to match on like columns or by using a Base SAS® Program Node DATA Step Merge to match on the same variable name. However, what do you do when tables or data sets do not contain like columns or the same variable name and a Join or Merge cannot be used? We invite you to attend our exciting presentation on the Joinless Join where we teach you how to expand the power of SAS Enterprise Guide in a new way. We will empower you to creatively overcome the limits of a standard Join or Merge. You will learn how to design a Joinless Join based upon dependencies, indirect relationships, or no relationships at all between the tables or data sets. In addition, we will highlight how to use a Joinless Join to prepare unrelated joinless data to be utilized by PROC REPORT in creating a PDF. Come experience the power and the versatility of the Joinless Join to greatly expand your data transformation and analysis toolkit. We look forward to introducing you to the surprising paradox of the Joinless Join.
BI-12 : Using E-Miner to Create Model Documentation and/or Reproducible Research
Rex Pruitt, SAS
Tuesday, 3:00 PM - 3:50 PM, Location: Herndon
Businesses need to automate the documentation of their models and integrate the resulting documentation into a Model Risk Management process. Most processes involve interactions with too many applications in order to produce the documentation necessary for hundreds of models in production. Limited model documentation has resulted in organizations being placed on written agreement with the Federal Reserve and/or Consumer Financial Protection Bureau (CFPB). This has led to fines into the billions of dollars. SAS software provides a solution to this model risk governance issue. Specifically, Enterprise Miner provides the functionality needed to produce the model documentation that governance requirements call for. The key features that support model documentation are:
Reporter node
- Uses SAS Output Delivery System to create a PDF or RTF of a process flow.
- Helps document the analysis process and facilitate results sharing.
- Document can be saved and included in SAS Enterprise Miner results packages.
- Includes image of the process flow diagram.
- User-defined notes entry.
Reproducible Research
- Build more, better models faster.
- Provides XML diagram exchange.
- Reuse diagrams as templates for other projects or users.
- Directly load a specific data mining project or diagram, or choose from a Project Navigator tree that contains the most recent projects or diagrams.
Product(s) Used: Enterprise Miner v13.2, Enterprise Guide v6.1, and Base SAS v9.4
SAS User Skill Level: Intermediate to Advanced
Beyond the Basics
BB-01 : Five Little Known, But Highly Valuable and Widely Usable, PROC SQL Programming Techniques
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 8:00 AM - 8:50 AM, Location: Hill
The SQL Procedure contains a number of powerful and elegant language features for SQL users. This presentation highlights five little known, but highly valuable and widely usable, topics that will help users harness the power of the SQL procedure. Topics include using PROC SQL to identify FIRST.row, LAST.row and Between.rows in BY-group processing; constructing and searching the contents of a value-list macro variable for a specific value; data validation operations; data summary operations to process down rows and across columns; and using the MSGLEVEL= system option and _METHOD SQL option to capture information about the processes performed during query evaluation, the algorithm selected and used by the optimizer when processing a query, testing and debugging operations, and other processes.
BB-02 : GreenSpace: A Macro to Improve a SAS Data Set Footprint
Brian Varney, Experis
Monday, 9:00 AM - 9:20 AM, Location: Hill
SAS programs can be very I/O intensive. SAS Data Sets with inappropriate variable attributes can degrade the performance of SAS programs. Using SAS compression offers some relief but does not eliminate the issue of inappropriately defined SAS variables. This paper intends to examine the problems inappropriate SAS variable attributes cause as well as a macro to tackle the problem of minimizing the footprint of a SAS Data Set.
BB-03 : Just passing through... Or are you? Determine when SQL Pass-Through occurs to optimize your queries
Misty Johnson, State of WI-DHS
Monday, 11:00 AM - 11:20 AM, Location: Hill
SAS/ACCESS® has two recommended methods for accessing data within a relational database management system (DBMS), namely, the SAS/ACCESS LIBNAME interface engine and the Structured Query Language (SQL) Pass-Through Facility. This paper describes the use of the open database connectivity (ODBC) LIBNAME engine with SAS 9.3 code that does and does not invoke implicit SQL pass-through, and its effect on run time. Also described are the use of the system options DEBUG and SASTRACE to determine whether implicit SQL pass-through occurred, what triggers implicit SQL pass-through, and the potential time savings. Knowledge of these methods, their triggers, and tracking options enables the intermediate SAS programmer to select the most efficient coding strategy.
BB-04 : Securing Software Portability through Self-Extracting, Self-Spawning, Spontaneously Combustible Code
Troy Hughes, No Affiliation
Tuesday, 2:00 PM - 2:20 PM, Location: Hill
Spontaneous combustion is defined as combustion that occurs without an external ignition source. With the right combination of fire tetrahedron components (including fuel, oxidizer, heat, and chemical reaction) it forms a deadly yet awe-inspiring phenomenon. One remarkable aspect of spontaneous combustion is its portability: an old rag soaked in linseed oil can ignite a garage, basement, attic, or other enclosed space with no other flame present. Combustion, on the other hand, typically requires a fire source, such as a match, flint, or spark plugs in the case of combustion engines. SAS code as well often requires a "spark" the first time it is run, including when existing code is imported to a new server or environment. SAS libraries, folder structures, configuration files, control tables, and other artifacts may be required for program execution yet not exist, causing sometimes extensive work and rework by developers to create these components. Thus, a more portable and reusable solution is self-extracting code in which the required infrastructure is validated and, where components do not exist, they are created automatically. This text describes methods that increase the portability of code to new environments, by demonstrating self-extracting, self-spawning, spontaneously combustible SAS code.
BB-05 : Retrieving Survey Data using the Qualtrics REST API with SAS
Katie Tanner, Capella University
Monday, 11:30 AM - 11:50 AM, Location: Hill
Qualtrics Research Suite is a powerful tool for collecting data online. It features a Representational State Transfer (REST) Application Programming Interface that allows other programs to interact with the Qualtrics system. Using the HTTP procedure first introduced in SAS 9.2, SAS has the ability to interface directly with the Qualtrics system. Programs can be easily created to seamlessly transfer survey data from the data collection tool into SAS for further analysis and manipulation. This paper will walk through a process for accessing the Qualtrics API and storing the resulting data in a SAS dataset. If the survey remains unchanged, this process could run automatically in batch mode to provide continuous updates of survey information for reporting and archive purposes.
BB-06 : Color, Rank, Count, Name; Controlling it all in PROC REPORT
Art Carpenter, CA Occidental Consultants
Monday, 1:30 PM - 2:20 PM, Location: Hill
Managing and coordinating various aspects of a report can be challenging. This is especially true when the structure and composition of the report is data driven. For complex reports the inclusion of attributes such as color, labeling, and the ordering of items complicates the coding process. Fortunately we have some powerful reporting tools in SAS® that allow the process to be automated to a great extent. In the example presented in this paper we are tasked with generating an EXCEL® spreadsheet that ranks types of injuries within age groups. A given injury type is to receive a constant color regardless of its rank and the labeling is to include not only the injury label, but the actual count as well. Of course the user needs to be able to control such things as the age groups, color selection and order, and number of desired ranks.
BB-07 : Controlling Colors by Name; Selecting, Ordering, and Using Colors for Your Viewing Pleasure
Art Carpenter, CA Occidental Consultants
Monday, 1:00 PM - 1:20 PM, Location: Hill
Within SAS® literally millions of colors are available for use in our charts, graphs, and reports. We can name these colors using techniques which include color wheels, RGB (Red, Green, Blue) HEX codes, and HLS (Hue, Lightness, Saturation) HEX codes. But sometimes I just want to use a color by name. When I want purple, I want to be able to ask for purple, not CX703070 or H03C5066. But am I limiting myself to just one purple? What about light purple or pinkish purple? Do those colors have names, or must I use the codes? It turns out that they do have names. Names that we can use. Names that we can select, names that we can order, names that we can use to build our graphs and reports. This paper will show you how to gather color names and manipulate them so that you can take advantage of your favorite purple; be it 'purple', 'grayish purple', 'vivid purple', or 'pale purplish blue'.
BB-08 : Table Lookup Techniques: From the Basics to the Innovative
Art Carpenter, CA Occidental Consultants
Tuesday, 9:00 AM - 9:50 AM, Location: Hill
One of the more commonly needed operations within SAS® programming is to determine the value of one variable based on the value of another. A series of techniques and tools have evolved over the years to make the matching of these values go faster, smoother, and easier. A majority of these techniques require operations such as sorting, searching, and comparing. As it turns out, these types of techniques are some of the more computationally intensive, and consequently an understanding of the operations involved and a careful selection of the specific technique can often save the user a substantial amount of computing resources. Many of the more advanced techniques can require substantially fewer resources. It is incumbent on the user to have a broad understanding of the issues involved and a more detailed understanding of the solutions available. Even if you do not currently have a BIG data problem, you should at the very least have a basic knowledge of the kinds of techniques that are available for your use.
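As a language-neutral illustration of why the choice of lookup technique affects resource use, the sketch below contrasts a sequential search with a keyed (hash-style) lookup. It is written in Python rather than SAS, the codes and labels are invented for the example, and it is not taken from the paper.

```python
# Hypothetical lookup table: code -> label (values invented for illustration).
lookup_pairs = [("A", "Alpha"), ("B", "Bravo"), ("C", "Charlie")]


def sequential_lookup(code, pairs):
    """Scan the pairs in order until the code matches; cost grows with table size."""
    for key, label in pairs:
        if key == code:
            return label
    return None


# Build a keyed table once; each later lookup is a single hashed access.
lookup_table = dict(lookup_pairs)


def keyed_lookup(code, table):
    """Return the label for code, or None when the code is absent."""
    return table.get(code)


codes = ["B", "C", "Z"]
sequential_results = [sequential_lookup(c, lookup_pairs) for c in codes]
keyed_results = [keyed_lookup(c, lookup_table) for c in codes]
```

Both approaches return the same answers; the difference is that the sequential version re-scans the table for every value, while the keyed version pays a one-time cost to build the table.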
BB-09 : Data Labs with SAS and Teradata: Value, Purpose and Best Practices
Helen Fowler, Teradata
Tho Nguyen, Teradata
Tuesday, 11:00 AM - 11:50 AM, Location: Hill
A data lab, also called a 'play pen' or 'sand box', is an area to explore and examine ideas and possibilities by combining new data with existing data to create experimental designs and ad-hoc queries without interrupting the production environment. A Teradata data lab with SAS provides SAS users immediate access to critical data for exploration and discovery. It is an environment that enables agile in-database analytics by simplifying the provisioning and management of analytic workspace within the production data warehouse. By allocating that space, it provides data lab users easy access to all of the data without moving or duplicating the data. Come learn how SAS and Teradata are integrated in the data lab, and hear some best practices and use cases from our joint customers.
BB-10 : An Autoexec Companion, Allocating Location Names during Startup
Ronald Fehd, Stakana Analytics
Monday, 9:30 AM - 9:50 AM, Location: Hill
Like other computer languages, SAS® software provides a method to automatically execute statements during startup of a program or session. This paper examines the names of locations chosen in the filename and libname statements and the placement of those names in options that enable all programs in a project standardized access to format and macro catalogs, data sets of function definitions and folders containing reusable programs and macros. It also shows the use of the global symbol table to provide variables for document design. The purpose of this paper is to examine the default values of options, suggest naming conventions where missing, and provide both an example autoexec and a program to test it.
BB-11 : Do we need Macros? An Essay on the Theory of Application Development
Ronald Fehd, Stakana Analytics
Tuesday, 10:30 AM - 10:50 AM, Location: Hill
This paper examines the theoretical steps of applications development (ApDev) of routines and subroutines. It compares and contrasts the benefits of using the %include statement versus macros. It examines the methods of calling subroutines, e.g., sql, call execute and macro loops. The purpose of this paper is to highlight the benefits of using macros to support unit and integration testing, and searching and finding issues during maintenance.
BB-12 : Wow you did that with SAS Stored Processes?
David Mitchell, Solution Design Team
Rick Trojan, Owner
Monday, 3:00 PM - 3:50 PM, Location: Hill
In this paper you will be introduced to advanced SAS 9.4 Stored Process functions developed with Enterprise Guide 7.1 and surfaced through HTML pages. The application design is based on SAS 9.4 client server architecture featuring a discussion of integration with metadata objects in the SAS Management Console and external databases. The audience will also learn practical techniques used to create stored processes that will cover:
* Macro Code
* Stored Process Linking
* Prompt Dependencies
* Hidden Prompts
* Prompt Groupings
* Dynamic Prompting
* Formats and Outputs - email and pdf
This application is a set of linked stored processes that allow tracking of internal customer contracts. The application integrates several SAS tables driven by stored processes to:
* Maintain a customer database
* Create projects from contracts
* Create work orders
* Allow posting of time spent for consultants
* Create output for a multitude of reports, emails and PDF files
Supporting tables include:
* Special tables for maintaining unique keys
* Tables to maintain fairly static data
* History Files for all project work completed
* Special Tables for Stored Processes that maintain Tables and Reports
BB-13 : Tips and Tricks for getting the most out of the SAS Macro facility
Scott Miller, Wells Fargo
Monday, 4:00 PM - 4:20 PM, Location: Hill
The Base SAS macro facility is a powerful tool to enable your SAS programs to work dynamically with data. However, the richness of this tool can be very confusing, and even experienced SAS coders can be mystified by macros. Learn a variety of ways to improve your coding technique with SAS macro programs and macro variables.
BB-16 : Cleaning Dirty Data with Just a Handful of SAS Functions
Ben Cochran, The Bedford Group
Tuesday, 1:00 PM - 1:50 PM, Location: Hill
On occasion, SAS users might find themselves in a position where they need to do some data cleaning. Since dirty data is a very common occurrence, they may need to do this quite often. This presentation looks at some functions that can help clean and scrub the data. Specifically, the COMPRESS, TRANSLATE, LENGTH, and INDEXC functions, as well as others, are highlighted in this presentation.
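To give a feel for the kind of character-level cleaning these functions perform, the sketch below mimics the general behavior of COMPRESS, TRANSLATE, and INDEXC. It is written in Python rather than SAS, the helper names and sample values are hypothetical, and the helpers are simplified: the real SAS functions have modifiers and padding rules that are not reproduced here.

```python
def compress(text, chars=" "):
    """Remove every character listed in chars from text (blanks by default)."""
    return "".join(ch for ch in text if ch not in chars)


def translate(text, to_chars, from_chars):
    """Replace each character in from_chars with its counterpart in to_chars.

    Simplification: both character lists must have the same length.
    """
    return text.translate(str.maketrans(from_chars, to_chars))


def indexc(text, chars):
    """Return the 1-based position of the first character of text found in chars, or 0."""
    for position, ch in enumerate(text, start=1):
        if ch in chars:
            return position
    return 0


# Sample values invented for illustration.
digits_only = compress("(608) 555-0100", "() -")         # strip punctuation and blanks
dotted = translate("608-555-0100", ".", "-")             # swap dashes for dots
first_letter_at = indexc("12A45", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")  # locate a stray letter
```

The argument order of the `translate` helper follows the SAS convention (source, to, from), which is the reverse of what many programmers expect.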
BB-17 : When Reliable Programs Fail: Designing for Timely, Efficient, Push-Button Recovery
Troy Hughes, No Affiliation
Tuesday, 3:00 PM - 3:50 PM, Location: Hill
Software quality comprises a combination of both functional and performance requirements that together specify not only what software should accomplish, but also how well it should accomplish it. Recoverability, a common performance objective, represents the timeliness and efficiency with which software or a system can resume functioning following a catastrophic failure. Thus, requirements for high availability software often specify the recovery time objective (RTO), or the maximum amount of time that software may be down following an unplanned failure or a planned outage. While systems demanding high or near perfect availability will require redundant hardware, network, and additional infrastructure, software too must facilitate rapid recovery. And, in environments in which system or hardware redundancy is infeasible, recoverability can be improved only through effective software development practices. Because even the most robust code can fail under duress or due to unavoidable or unpredictable circumstances, software reliability must incorporate recoverability principles and methods. This text introduces the TEACH mnemonic that describes guiding principles that software recovery should be timely, efficient, autonomous, constant, and harmless. Moreover, the text introduces the SPICIER mnemonic that describes discrete phases in the recovery period, each of which can benefit from and be optimized with TEACH principles. Software failure is inevitable but negative impacts can be minimized through SAS® development best practices.
BB-18 : Beyond a One-Stoplight Town: A Base SAS Solution to Preventing Data Access Collisions through the Detection, Deployment, Monitoring, and Optimization of Shared and Exclusive File Locks
Troy Hughes, No Affiliation
Monday, 10:30 AM - 10:50 AM, Location: Hill
The LOCKITDOWN macro, introduced at WUSS in 2014, both detects and prevents data access collisions that occur when two or more SAS processes or users simultaneously attempt to access the same SAS data set. With the implementation of LOCKITDOWN, code reliability and robustness can be greatly improved through the elimination of this common source of process failure. And, as processes patiently wait for each other and play nicely without developer intervention or supervision, code autonomy and automation can be achieved. This autonomy, however, can hide inefficiencies that are created when processes repeatedly vie for data set access and are forced to wait for each other. This text introduces an expanded LOCKANDTRACK macro that includes all functionality of LOCKITDOWN and additionally tracks all successful and unsuccessful file lock attempts through a unified control table. HTML reports demonstrate unsuccessful and delayed lock attempts, elucidating to developers where potential bottlenecks exist and where potential efficiencies can be gained.
BB-19 : Getting Started With the SAS Add-in For Microsoft Office Product
Ben Cochran, The Bedford Group
Monday, 8:00 AM - 8:50 AM, Location: Hill
If you like living in the Microsoft world, you can still use the power of SAS to access data, generate reports and do all kinds of SAS things. This paper looks at some of the SAS things you can do while working in Microsoft Word, Excel, or PowerPoint.
Career Development
CD-01 : What's Hot, What's Not - Skills for SAS® Professionals
Kirk Paul Lafler, Software Intelligence Corporation
Charlie Shipp, Consider Consulting Corporation
Monday, 8:00 AM - 8:50 AM, Location: Herndon
As a new generation of SAS® user emerges, current and prior generations of users have an extensive array of procedures, programming tools, approaches and techniques to choose from. This presentation identifies and explores the areas that are hot and not-so-hot in the world of the professional SAS user. Topics include Enterprise Guide, PROC SQL, PROC REPORT, Macro Language, DATA step programming techniques such as arrays and hash, SAS University Edition software, support.sas.com, sasCommunity.org®, LexJansen.com, JMP®, and Output Delivery System (ODS).
CD-02 : SAS® Software as an Essential Tool in Statistical Consulting and Research
Tim O'Brien, Loyola University Chicago
Monday, 9:00 AM - 9:50 AM, Location: Herndon
Basic courses in applied biostatistics, statistical methods, design and regression focus on hypothesis testing and estimation for linear and nonlinear models by providing students and decision-makers the methodology (i.e., test statistics and confidence intervals) to reach conclusions by narrowly focusing on individual t-tests and global F-tests. These courses leave students and managers with an overly simplistic view of how informed statistical decisions are made in practice. This paper focuses on the more recent pedagogical ideas of exposing students to underlying likelihood methods and treating these specific (t- and F-) tests as special cases embedded in this larger structure. Key to this better decision-making process is powerful statistical software, and our focus is here on the use of the NLMIXED and IML procedures available in SAS® software to provide the means to make some of these important decisions. This approach enables students and managers to pose and examine more meaningful queries. For example, the techniques discussed here allow practitioners to focus on the estimation of important model parameters in the presence of serially correlated errors rather than on the detection of the exact time-series error structure. Numerous additional practical examples of the applicability of likelihood methods are provided and discussed; specifically, the provided illustrations include novel approaches useful in statistical modelling, drug synergy and relative potency. KEY WORDS: decision-making; likelihood; modelling; statistical education.
CD-03 : Preparing for Future Careers in Big Data, Marketing Analytics and Data Warehousing
Helen Fowler, Teradata
Monday, 10:30 AM - 10:50 AM, Location: Herndon
A quick and current outlook for the future of careers in big data, marketing analytics and data warehousing. Examples of successful decisions and tips will be provided.
CD-04 : Career Development Panel Discussion
John Xu, First Consulting
Monday, 11:00 AM - 11:50 AM, Location: Herndon
The future of SAS Professionals remains bright. Come, listen and learn from our most successful leaders on how to enhance your career opportunities and prepare for the future. You can ask questions about working in corporations, in academia, independently, and in other settings.
Data Visualization and Graphics
DV-02 : A Macro to Easily Generate an interactive Google Map Report
Ting Sa, Cincinnati Children's Hospital Medical Center
Monday, 1:30 PM - 1:50 PM, Location: Washington City
In this paper, a SAS macro is introduced to generate an interactive Google Map report. Using this macro, you can mark, or drop a pin on any location on a Google map by longitude and latitude coordinates. This paper also provides several ways you can find the longitude and latitude coordinates. Once a location is marked on the map, users can further use the mouse to hover over, or click the pin icon to display a pop out dialogue box with additional information about that location. The information shown in the dialogue box can be any multimedia information, such as plain text, images, videos, URL, email links, etc. The report generated by this macro retains all the functionalities of the Google map, allowing you to zoom in, zoom out, or move the map in the report, show the map in satellite mode, etc. The macro also has capability for you to display different styles of pin icons on the map. The SAS user does not need prior knowledge or expertise in any website programming language to use this macro. The SAS user only needs to prepare the input data, call the macro, and the Google map report will be generated.
DV-03 : Sales Force Alignment Visualization with SAS®
Yu Daniel Wang, Experis
Monday, 2:00 PM - 2:20 PM, Location: Washington City
SAS 9.4, OpenStreetMap (OSM) and JAVA APPLET provide tools to generate professional Google-like maps. The zip code boundary data files from the U.S. Census Bureau can be freely downloaded and imported into SAS by PROC DATAIMPORT. PROC GEOCODE, with the STREET method, can get the longitude and latitude of a street-level address from the data files USM, USS, and USP downloaded from the SAS Maps Online website. A dataset of sales force alignment is created after running the sales force optimization model implemented with SAS OPTMODEL. This paper demonstrates the use of all these datasets with SAS to exhibit sales force alignment and target locations on Google-like maps with cities, highways, roads, bodies of water and forests in the background. Each alignment-defined area, or territory, has its own color. Each territory ID is labelled at its 'center' location. Each salesperson's office location is marked with his/her name underneath. The boundary of each zip code in a territory is displayed. Each zip code and the number of targets in the zip code are labelled. Different targets could be at the same location. Each target location is dotted with a different color to reflect the number range of targets at the same address.
DV-04 : Visualize the Geography of your Business Insights using SAS MAP tools
Xingrong Zhang, Alliance Data Retail
Monday, 3:00 PM - 3:20 PM, Location: Washington City
Illustrating business insights on maps not only gives analysts easier ways to show different findings across different regions but also gives business leaders visual revelations of critical business insights to aid their planning and decision making. The GMAP procedure in SAS allows users to show data geographically. This paper will demonstrate how to present business insights using PROC GMAP with real life examples and show how additional map features can be added to SAS maps to make them visually stimulating by implementing annotated datasets. Sample code and a macro for map cosmetics will be provided.
DV-05 : Getting Productive Fast in SAS ODS Graphics -- a "Simple Look-See" Approach
Roger Muller, Data To Events, Inc
Monday, 3:30 PM - 3:50 PM, Location: Washington City
SAS ODS Graphics started appearing in version 9.2 of SAS. Some statistical capabilities with graphics were introduced with selected procedures in an earlier version. When first starting to use these tools, the traditional SAS/Graph user may come upon some very significant challenges in learning the new way to do things. This is further complicated by the lack of simple demonstrations of capabilities. Most graphs in training materials and publications are rather complicated graphs that, while useful, are not good teaching examples. This paper contains many examples of very simple ways to get very simple things accomplished. Over 20 different graphs will be developed using only a few lines of code each, using data from the SASHELP.CLASS and SASHELP.CARS datasets. The use of Proc SGplot and Proc SGPanel will be shown. In addition, the paper addresses those situations where the user must alternatively use a combination of Proc Template with Proc SGRender to accomplish the task. The emphasis of this paper is simplicity in the learning process. Users will be able to take the included code and run it immediately on their personal machines, as the data is included with the SAS installation.
DV-06 : Intuitive Demonstration of Statistics through Data Visualization of Pseudo-Randomly Generated Numbers in R and SAS
Jack Sawilowsky, Union Pacific Railroad
Monday, 4:00 PM - 4:20 PM, Location: Washington City
Statistics training courses are offered to employees in a corporate environment. Monte Carlo simulations are created as teaching devices for live demonstrations of statistical concepts. These results are then coded into data visualizations to aid in intuitive understanding by the audience. Coding is done in both SAS and R, and a comparison is made to show how both platforms are capable of performing the required tasks. Both code and output plots are presented.
DV-07 : Cutpoint Determination Methods in Survival Analysis using SAS®: Updated %FINDCUT macro
Jeffrey Meyers, Mayo Clinic
Jayawant Mandrekar, Mayo Clinic
Monday, 1:00 PM - 1:20 PM, Location: Washington City
Statistical analyses that use data from clinical or epidemiological studies often include continuous variables such as a patient's age, blood pressure, and various biomarkers. Over the years there has been an increase in studies that focus on assessing associations between biomarkers and a disease of interest. Many of the biomarkers are measured as continuous variables. Investigators seek to identify a possible cutpoint to classify patients as high risk versus low risk based on the value of the biomarker. Several data-oriented techniques such as median and upper quartile, and outcome-oriented techniques based on score, Wald and likelihood ratio tests, are commonly used in the literature. Contal and O'Quigley (1999) presented a technique that used the log rank test statistic in order to estimate the cutpoint. Their method was computationally intensive and hence was overlooked due to the unavailability of built-in options in standard statistical software. In 2003, we provided the %FINDCUT macro that used Contal and O'Quigley's approach to identify a cutpoint when the outcome of interest was measured as time to event. Over the past decade demand for this macro has continued to grow, which has led us to consider updating the %FINDCUT macro to incorporate new tools and procedures from SAS such as array processing, Graph Template Language, and the REPORT procedure. New and updated features will include: results presented in a much cleaner report format, user specified cut points, macro parameter error checking, temporary data set clean-up, preserving current option settings, and increased processing speed. We intend to present the utility and added options of the revised %FINDCUT macro using a real life dataset. In addition, we will critically compare this method with some of the existing methods and discuss the use and misuse of categorizing a continuous covariate.
Pharmaceutical Applications
PH-01 : Transitions in Depressive Symptoms After 10 Years of Follow-up Using PROC LTA in SAS® and Mplus
Seungyoung Hwang, Johns Hopkins Bloomberg School of Public Health
Monday, 8:30 AM - 8:50 AM, Location: Cozzens
PROC LTA is the most popular and powerful SAS procedure for latent transition analysis used throughout a wide variety of scientific disciplines. However, PROC LTA does not provide standard errors of the parameter estimates and thus constructing 95% confidence intervals around estimates is not possible. In this paper, the author shows how to examine transitions in latent statuses of depressive symptoms after 10 years of follow-up using PROC LTA in SAS. The author then examines whether clinical characteristics predict membership in the different statuses and transitions between latent statuses over time using both SAS and Mplus. Mplus programming code is provided to compute standard errors of the parameter estimates. The dataset used is based on the Baltimore Epidemiologic Catchment Area Study. This paper gently guides SAS and Mplus users (even those with limited experience in statistics or who have never used these software packages) through a step-by-step approach to using SAS and Mplus for latent transition analysis, and gives advice on how to interpret the results. This paper is suited to students who are beginning their study of social and behavioral health sciences and to professors and research professionals who are conducting research in the fields of epidemiology, clinical psychology, or health services research.
PH-02 : Automated Project Management of SAS Tasks - Excel Dashboard without Using any Program
Kalaivani Raghunathan, Quartesian Clinical Research Pvt Ltd
Monday, 9:00 AM - 9:20 AM, Location: Cozzens
Have you ever imagined having a fully automated, simple and live summary report of a clinical SAS programming project tracker without running any program? Yes. This paper explains how that can come true. The day of clinical programmers or statisticians starts with SAS and ends with SAS since it is the one and only language widely used in clinical trial data analysis. Project management is, as per Wikipedia, the process and activity of planning, organizing, motivating, and controlling resources to achieve a specific set of goals or requirements. Hence it becomes necessary to manage the many SAS programs developed during the trial to generate and validate the datasets, tables, listings and figures (DTLFs). This SAS programming management is mostly done in Microsoft Excel. The project document lists the DTLFs to be generated, the assigned programmer name, the QC programmer name, the programming status (such as completed or ongoing), comments by either the programmer or the QC programmer, and so on. This paper illustrates how to create a template for the report as per your requirements, which is a one-time task, and how to make it a dynamic report. Whenever the tracker is updated, this summary information will be refreshed automatically, without running any program, when the Excel file is opened. This automation is achieved using pivot tables in Excel.
PH-03 : How to Build a Hierarchical Mixed Model in SAS
Sara Burns, Washington University in St. Louis
Eric Novak, Washington University School of Medicine
Amit Amin, Washington University School of Medicine
Monday, 9:30 AM - 9:50 AM, Location: Cozzens
Mixed models are characterized as containing both fixed and random effects. Hierarchical models, also known as multi-level models, share a defining feature of having individual observations grouped in some way. These models allow us to analyze the data on individuals nested within hierarchies (e.g., patients within hospitals, students within schools) while accounting for both the fixed and random effects. Hierarchical mixed models have been widely used in educational and behavioral research. Mixed models have been applied to healthcare data because they can adequately handle clustered data as well as repeated measures data. They are particularly useful in healthcare research because we often want to account for the variation across hospitals. Through an applied example, this paper will illustrate how SAS PROC MIXED can be utilized to build hierarchical mixed models. There are several options and coding techniques that can be helpful in ensuring that the hierarchical mixed model will run smoothly. This paper will present a real example of how to utilize SAS PROC MIXED to model the cost of early readmission after percutaneous coronary intervention. The appropriate application of the RANDOM statement to account for hospital as a higher level unit will be shown, as well as the LSMEANS statement to obtain cost estimates. The paper also shows some strategies for reviewing model diagnostics.
PH-06 : Plotting LSMEANS and Differences in Generalized Linear Models with GTL
Robin High, University of Nebraska Medical Center
Monday, 10:30 AM - 10:50 AM, Location: Cozzens
A visual display of LSMeans and their pairwise differences in a generalized linear model is an important component of data analysis for interpreting comparisons of LSMeans. The SGPLOT procedure from SAS® software will produce graphs from an ANOVA for LSMeans and their differences with confidence intervals, including the forest plot, the mean-mean scatter plot, and the mean-mean multiple comparison (MMC) plot. Greater flexibility in making a combined plot of the LSMeans and their differences with a forest plot can be achieved with the Graph Template Language (GTL). The process consists of appending two data sets produced with ODS OUTPUT statements, one containing the LSMeans and the other their differences, into one data set in block diagonal form and then submitting this data set to a template with SGRENDER. This graph can also be enhanced to include results from the LINES option in the LSMEANS statement of PROC GLIMMIX, to help interpret pairwise differences sorted by the decreasing value of the LSMeans. These plots provide an introduction to producing complex graphs directly with graph templates, including the use of macro variables for modifying graph layout and appearance and of dynamic variables for content. These techniques require a basic statistics background in ANOVA and experience with the SGPLOT procedure.
PH-07 : Utilizing Propensity Score Analyses to Adjust for Selection Bias: A Study of Adolescent Mental Illness and Substance Use
Deanna Schreiber-Gregory, National University
Tuesday, 1:00 PM - 1:50 PM, Location: Cozzens
An important strength of observational studies is the ability to estimate a key behavior or treatment's effect on a specific health outcome. This is a crucial strength as most health outcomes research studies are unable to use experimental designs due to ethical and other constraints. Keeping this in mind, one drawback of observational studies (that experimental studies naturally control for) is that they lack the ability to randomize their participants into treatment groups. This can result in the unwanted inclusion of a selection bias. One way to adjust for a selection bias is through the utilization of a propensity score analysis. In this study we provide an example of how to utilize these types of analyses. Our concern is whether recent substance abuse has an effect on an adolescent's identification of suicidal thoughts. In order to conduct this analysis, a selection bias was identified and adjustment was sought through three common forms of propensity scoring: stratification, matching, and regression adjustment. Each form is separately conducted, reviewed, and assessed as to its effectiveness in improving the model. Data for this study was gathered through the Youth Risk Behavior Surveillance System, an ongoing nationwide project of the Centers for Disease Control and Prevention. This presentation is designed for any level of statistician, SAS® programmer, or data analyst with an interest in controlling for selection bias, as well as for anyone who has an interest in the effects of substance abuse on mental illness.
PH-08 : Ushering SAS Emergency Medicine into the 21st Century: Toward Exception Handling Objectives, Actions, Outcomes, and Comms
Troy Hughes, No Affiliation
Monday, 11:00 AM - 11:50 AM, Location: Cozzens
Exception handling describes both the identification of and response to adverse, unexpected, or untimely events that can cause process or program failure, as well as anticipated events or environmental attributes that must be handled dynamically through prescribed, predetermined courses of action. Rapid error suppression and return to functioning is the hopeful end state but, when catastrophic events do occur, exception handling routines can terminate a process or program gracefully while providing meaningful execution and environmental metrics to developers both for remediation and future model refinement. SAS literature, however, too often depicts exception handling routines that either abruptly terminate the SAS session or which provide a static "exception report" to the log or ODS output stream, failing to capitalize on the full dynamic potential of exception handling. This text introduces the full array of potential reactions that software can take when an exception is encountered. Moreover, it presents various communication modalities to inform stakeholders of the exceptional event.
Posters
PO-01 : Keep the Formats When Exporting to Excel
Ting Sa, Cincinnati Children's Hospital Medical Center
When using SAS to export data to Microsoft Excel, the formats get lost. This paper introduces a macro that will allow you to export formatted SAS data to Microsoft Excel without losing the formats. The advantage of this macro is that it only requires you to specify the input data set, and the name and pathname for the output Excel file. The macro creates the Excel file and preserves all the formats in the SAS data set.
Rapid Fire
RF-01 : Essential DATA Step Merge Techniques Using SAS® University Edition Software
Kirk Paul Lafler, Software Intelligence Corporation
Charlie Shipp, Consider Consulting Corporation
Tuesday, 2:30 PM - 2:40 PM, Location: Cozzens
After installing SAS Institute's free SAS University Edition you'll want to test drive the software. SAS University Edition includes Base SAS, SAS/STAT, SAS/IML, Designer Studio (user interface), and SAS/ACCESS for Windows, with all the powerful features found in the licensed SAS versions. To demonstrate the power found within SAS University Edition, we present conventional and unconventional DATA step merge programming techniques using Base SAS software. All SAS users are encouraged to attend and learn essential concepts, syntax and programming techniques.
RF-02 : Essential PROC SQL Join Techniques Using SAS® University Edition Software
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 2:00 PM - 2:10 PM, Location: Cozzens
After installing SAS Institute's free SAS University Edition you'll want to test drive the software. SAS University Edition includes Base SAS, SAS/STAT, SAS/IML, Designer Studio (user interface), and SAS/ACCESS for Windows, with all the powerful features found in the licensed SAS versions. To demonstrate the power found within SAS University Edition, we present the power and versatility of conventional and unconventional PROC SQL join programming techniques using Base SAS software. All SAS users are encouraged to attend and learn essential concepts, syntax and programming techniques.
RF-03 : SAS Dates with Decimal Time
John King, Ouachita Clinical Data Services, Inc.
Monday, 8:40 AM - 8:50 AM, Location: Washington City
Did you know that a SAS date can have a decimal part, which SAS treats, as you might expect, as a fraction of a day, that is, a time? When data are imported from Excel, an Excel date-time is often imported as a SAS date with a decimal part. This Rapid Fire tip discusses how to identify these values and gives examples of how to work with them, including how to convert them to SAS datetime values.
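The arithmetic behind this tip can be sketched outside SAS. The Python illustration below is not taken from the paper; it relies only on the standard SAS conventions that a date value counts days since 1 January 1960 and a datetime value counts seconds since midnight on that day. The function names and the sample value are hypothetical.

```python
from datetime import datetime, timedelta

SAS_EPOCH = datetime(1960, 1, 1)   # day 0 for SAS dates, second 0 for SAS datetimes
SECONDS_PER_DAY = 86_400

def has_time_part(sas_date: float) -> bool:
    """True when the date value carries a fraction of a day."""
    return sas_date != int(sas_date)

def to_sas_datetime(sas_date: float) -> int:
    """Convert days since 1960 (with fraction) to whole seconds since 1960."""
    return round(sas_date * SECONDS_PER_DAY)

def to_python_datetime(sas_date: float) -> datetime:
    """The same instant as a Python datetime, for inspection."""
    return SAS_EPOCH + timedelta(days=sas_date)

if __name__ == "__main__":
    value = 20000.75   # hypothetical imported value: day 20000 plus 18 hours
    print(has_time_part(value), to_sas_datetime(value), to_python_datetime(value))
```

In SAS itself the same idea amounts to multiplying the date value by 86400 to obtain a datetime value; how the paper does it may differ.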
RF-04 : Reading a Column into a Row to Count N-levels, Calculate Cardinality Ratio and Create Frequency and Summary Output In One Step
Ronald Fehd, Stakana Analytics
Monday, 9:00 AM - 9:10 AM, Location: Washington City
This paper shows how to read a column of numeric values into an array and use the CALL SORTN routine in preparation for counting the number of levels of the variable. The primary goal of this algorithm is to calculate cardinality ratio which is n-levels divided by n-obs. This ratio can be used in Exploratory Data Analysis (EDA) to determine whether a variable is unique and therefore a row identifier, or discrete --- a classification variable --- or continuous --- an analysis variable. A useful benefit of traversing the array and counting n-levels is that the frequency counts and percents can be accumulated. Another benefit of having the values in an array is the ability to calculate summary statistics. The purpose of this paper is to show an optimized algorithm for calculating cardinality ratio in one data step. Previous algorithms required three steps: contents for n-obs, frequency for n-levels, and a data step for the calculation.
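The logic described above (sort the values, then traverse them once to count levels and accumulate frequencies) can be sketched as follows. This is a Python illustration of the idea, not the paper's DATA step code; the function name and the returned dictionary layout are assumptions made for the example.

```python
def cardinality_profile(values):
    """Sort once, then count levels and frequencies in a single pass."""
    ordered = sorted(values)
    n_obs = len(ordered)
    freq = []                       # [value, count] pairs in sorted order
    for v in ordered:
        if freq and freq[-1][0] == v:
            freq[-1][1] += 1        # same level as the previous value
        else:
            freq.append([v, 1])     # a new level starts here
    n_levels = len(freq)
    return {
        "n_obs": n_obs,
        "n_levels": n_levels,
        # cardinality ratio = n-levels divided by n-obs
        "cardinality_ratio": n_levels / n_obs if n_obs else 0.0,
        "freq": {v: c for v, c in freq},
        "percent": {v: 100.0 * c / n_obs for v, c in freq},
        "mean": sum(ordered) / n_obs if n_obs else None,
    }

if __name__ == "__main__":
    print(cardinality_profile([3, 1, 2, 3, 1, 3]))
```

A ratio of 1 suggests a row identifier, while a ratio near 0 suggests a classification variable.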
RF-05 : 'V' for... Variable Information Functions to the Rescue
Richann Watson, Experis
Karl Miller, InVentiv Health
Monday, 9:20 AM - 9:30 AM, Location: Washington City
There are times when we need to use the attributes of a variable within a data set. Normally, this can be done with a simple CONTENTS procedure. The information can be viewed prior to programming and then hardcoded within the program, or it can be saved to a data set that can be joined back to the main data set. If the attributes are hardcoded and the data set changes structure, the program needs to be updated accordingly. If the information from PROC CONTENTS is saved and then joined with the main data set, this needs to be done for every data set to be processed. This is where knowing your 'V' functions can come in handy. The 'V' functions can be used to return the label, format, length, name, type and/or value of a variable or a string within the data step. These functions can come in quite handy when you need to create summary statistics or perform an algorithm on a variable with a specific naming convention.
RF-06 : Automated LSTtoRTFtoPDF Converter
Palanisamy Mohan
Monday, 9:40 AM - 9:50 AM, Location: Washington City
In the pharmaceutical industry, reports are commonly created in LST, RTF or PDF format using SAS, either for FDA submissions or at a client's request. These report files are created using ODS in SAS. There are often requirements to convert these files from LST to RTF, RTF to PDF or LST to PDF for various reasons. To convert a file, one has to open it manually and save it in the required format, or use SAS or third-party software. Manual conversion becomes tedious, time consuming and error prone when there are many files to convert, as is to be expected in an FDA submission. Plenty of papers discuss file conversion methods that combine SAS and VBA, which requires a certain level of knowledge of both. In addition, one has to spend time writing the SAS and VBA programs according to the instructions in those papers. This paper introduces another approach that uses only VBA to automatically convert any number of files into the desired format, and users need to have knowledge of VBA. All user inputs are controlled through an Excel-based product that acts as an interface. Currently this has been tested for file conversion and will be enhanced for file compilation, such as making a single RTF or PDF from multiple files.
RF-07 : Macro to Create Multiple SQL In-clauses from One Column of Data
Mark Millman, Optum
Monday, 10:30 AM - 10:40 AM, Location: Washington City
When linking data from multiple sources, it is often necessary to pull data from one database using a list of IDs from another. While this can be accomplished in many ways, SQL pass-through is often the best. To leverage SQL pass-through, you must supply any information needed from outside the database in the query itself. Often this information sits in a dataset outside the server. This macro separates a given column of data into one or more SQL-compliant IN clauses, each with a count of items that the user specifies. Products: SAS, SAS/ACCESS, PROC SQL. Skill Level: Intermediate, Macro Usage. Audience: SQL programmers.
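The chunking step described above can be sketched in a few lines. The Python below only illustrates the idea and is not the paper's SAS macro; the function names, the column name `member_id`, and the quoting rule for character values are assumptions made for the example.

```python
def sql_literal(value):
    """Render a number as-is and anything else as a single-quoted SQL literal."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def build_in_clauses(ids, column, max_items):
    """Split ids into IN clauses holding at most max_items values each."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    clauses = []
    for start in range(0, len(ids), max_items):
        chunk = ids[start:start + max_items]
        items = ", ".join(sql_literal(v) for v in chunk)
        clauses.append(f"{column} IN ({items})")
    return clauses

if __name__ == "__main__":
    # Hypothetical ID list and column name
    for clause in build_in_clauses([101, 102, 103, 104, 105], "member_id", 2):
        print(clause)
```

The resulting clauses would typically be combined with OR inside the WHERE clause of the pass-through query, which keeps each list under whatever item limit the target database imposes.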
RF-08 : Essentials of macro quoting functions in SAS
Kaushal Chaudhary, Sanford Research
Monday, 10:50 AM - 11:00 AM, Location: Washington City
The SAS macro language is a text processing facility. Everything, including numeric values, is treated as text in the SAS macro language. Macro code can also contain special characters, such as , (comma), ; (semicolon), + (plus) and others. When macro code contains a + (plus) sign, the macro processor may see it as the arithmetic addition operator rather than as text. This can yield unintended results when you do not mean to add. Macro quoting functions come to the rescue by making the macro processor treat such characters as text. This paper introduces macro quoting functions with examples.
RF-09 : %SYSFUNC Is Your Friend
Kaushal Chaudhary, Sanford Research
Deanna Schreiber-Gregory, National University
Monday, 11:10 AM - 11:20 AM, Location: Washington City
The SAS DATA step has a large number of functions, but the macro language has relatively few macro functions. %SYSFUNC allows the use of almost all DATA step and user-written functions in the macro environment, thus bridging the gap between the SAS DATA step and the macro language. This gives SAS programmers a lot of flexibility when writing macro programs. In this paper we introduce %SYSFUNC and present several examples of its use.
RF-10 : A Macro for Systematic Treatment of Special Values in Weight of Evidence Variable Transformation
Chaoxian Cai, AFS Inc
Monday, 11:30 AM - 11:40 AM, Location: Washington City
Weight of evidence (WOE) recoding is a commonly applied technique in credit scoring model development to transform continuous predictor variables. Many predictor variables come with special values that are set beyond their normal ranges to annotate particular business meanings. PROC RANK and PROC UNIVARIATE are used to rank and group input data into WOE bins for predictor variables. Special values are usually placed in the last bucket after the binning procedures. Valuable information and predictive power may be lost from these special values if we do not separate them into discrete bins. This paper presents a macro that systematically separates every special value from the normal values and computes the associated WOE for each special value. The macro can handle hundreds of predictor variables at large scale. We can use this macro to explore special values effectively and discover their potential predictive power during the logistic model development process.
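To make the idea of separate bins for special values concrete, here is a minimal Python sketch. It is not the paper's macro: it uses fixed cut points where the paper uses PROC RANK and PROC UNIVARIATE, and it assumes the common convention WOE = ln(share of goods in the bin / share of bads in the bin), although some practitioners use the opposite sign. The special value 999, the cut point, and all names are hypothetical.

```python
import math
from collections import defaultdict

def woe_by_bin(records, special_values, cut_points):
    """records: iterable of (value, is_bad). Returns {bin label: WOE or None}."""
    counts = defaultdict(lambda: [0, 0])          # label -> [goods, bads]
    for value, is_bad in records:
        if value in special_values:
            label = f"special={value}"            # each special value gets its own bin
        else:
            # normal values: bin index = number of cut points the value exceeds
            label = "bin" + str(sum(value > c for c in cut_points))
        counts[label][1 if is_bad else 0] += 1

    total_good = sum(g for g, _ in counts.values())
    total_bad = sum(b for _, b in counts.values())
    woe = {}
    for label, (goods, bads) in counts.items():
        if goods == 0 or bads == 0:
            woe[label] = None                     # undefined without smoothing
        else:
            woe[label] = math.log((goods / total_good) / (bads / total_bad))
    return woe

if __name__ == "__main__":
    sample = [(999, False), (999, True), (999, True), (999, True),
              (10, False), (20, False), (30, False), (40, True)]
    print(woe_by_bin(sample, special_values={999}, cut_points=[50]))
```

Had 999 been left in the top bucket of the normal range, its distinct good/bad mix would have been blended with ordinary high values.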
RF-11 : Why Would You Not Want to Learn about PROC DOCUMENT?
Roger Muller, Data To Events, Inc
Monday, 8:00 AM - 8:10 AM, Location: Washington City
PROC DOCUMENT is a little-known procedure that can save you vast amounts of time and effort when managing the output of your SAS programming efforts. This procedure is deeply associated with the means by which SAS controls output in the ODS system. Have you ever wished you didn't have to modify and rerun the report-generating program every time there was some tweak in the desired report? PROC DOCUMENT allows you to store one version of the report and then call it out in many different output forms without rerunning the code -- e.g., PDF, HTML, listing, RTF, etc. Have you ever wished you could extract those pages of the output that apply to certain "by" variables such as State, Student_Name, Car_Model? With PROC DOCUMENT you have "where" capabilities to extract these. Do you wish to customize the table of contents that assorted SAS procedures produce when you make frames for the table of contents with HTML, or use the facilities available for PDF? PROC DOCUMENT allows you to get to the inner workings of ODS and manipulate them. This paper addresses PROC DOCUMENT from the viewpoint of end results, rather than a complete technical review of how to do the task at hand. The emphasis will be on the benefits of using the procedure, not on detailed mechanics.
RF-12 : Find your Way to Quick and Easy Table Lookups With FINDW
Josh Horstman, Nested Loop Consulting
Tuesday, 11:10 AM - 11:20 AM, Location: Cozzens
Table lookups involve setting the value of one variable based on the value of another variable. There are many ways to perform table lookups in SAS, and many papers have been written on the various techniques. This paper demonstrates a quick and easy method using the FINDW function and discusses when it makes sense to use this method and when more robust methods might be preferable.
RF-13 : Building PROC FORMAT Code from Data Dictionary Automatically
Ran Gu, Nebraska DHHS /UNL
Ashley Newmyer, Nebraska Department of Health and Human Service
Monday, 8:20 AM - 8:30 AM, Location: Washington City
In public health, software data collection systems gather information according to a data dictionary or data coding manual. These documents provide tables that map each valid input to a corresponding value or number, and the software records only the value in the dataset, not the descriptive text. In this case, PROC FORMAT was used to map the numerical input back to the descriptive text for each data element during analysis, according to the same data dictionary. However, writing PROC FORMAT code by hand is tedious, especially when the dataset contains many numeric fields to convert back to words. We present a method in which SAS generates PROC FORMAT code automatically for all of these variables at once, based on the data dictionary table. The advantage of this method lies not only in reducing the time and effort put into editing the PROC FORMAT code for many variables, but also in avoiding human error. In addition, this method facilitates data integration from different data sources that have different formats but compatible data dictionary tables. SAS Software and Platform: SAS 9.3/9.4 on Windows 7 or higher. Audience Expectation: Knowledge of the SAS input and output system. Experience in data manipulation.
RF-14 : Using a Picture Format to Create Visit Windows
Richann Watson, Experis
Tuesday, 10:30 AM - 10:40 AM, Location: Cozzens
Creating visit windows is sometimes required for analysis of data. We need to make sure that we get the visit/day in the proper window so that the data can be analyzed properly. However, defining these visit windows can be quite cumbersome especially if they have to be defined in numerous programs. This task can be made easier by applying a picture format, which can save a lot of time and coding. A format is easier to maintain than a bunch of individual programs. If a change to the algorithm is required, the format can be updated instead of updating all of the individual programs containing the visit definition code.
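The benefit of keeping the window definitions in one place can be illustrated with a small sketch. The paper does this with a SAS picture format; the Python below is only an analogy that uses a plain range lookup, and the window boundaries, labels, and function name are hypothetical.

```python
# (first day, last day, label); hypothetical windows kept in one shared place
VISIT_WINDOWS = [
    (1, 7, "Week 1"),
    (8, 21, "Week 2"),
    (22, 35, "Week 4"),
]

def visit_for_day(study_day, windows=VISIT_WINDOWS, other="Unscheduled"):
    """Return the label of the window containing study_day, else `other`."""
    for low, high, label in windows:
        if low <= study_day <= high:
            return label
    return other

if __name__ == "__main__":
    for day in (3, 10, 22, 40):
        print(day, visit_for_day(day))
```

If the windowing rule changes, only the shared table is edited, which mirrors the maintenance argument made for the format.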
RF-15 : A Simple Adjustment for Selection Bias Through Use of Propensity Scoring
Deanna Schreiber-Gregory, National University
Tuesday, 9:40 AM - 9:50 AM, Location: Cozzens
Observational studies are vital to the exploration of health outcomes research. They allow researchers to estimate the effect of a treatment or behavior on a specific health outcome. This is something we would not be able to do through an experimental procedure due to risk and ethical concerns. One concern of observational studies, however, is the fact that we cannot randomize participant placement into the treatment groups. This can result in the unwanted inclusion of a selection bias. One quick and easy way to adjust for a selection bias is through the utilization of a propensity score analysis through regression adjustment. In order to demonstrate how to do this, this presentation will seek to answer the question of how patients with a substance abuse/dependent diagnosis compare to the rest of the patient population in terms of status upon discharge. Data for this study was gathered through the National Hospital Discharge Survey, a nationwide project that collected data from acute care facilities in the United States until 2010. This presentation is designed for any level of statistician, SAS® programmer, or data analyst with an interest in controlling for selection bias, as well as for anyone who has an interest in the explored topic.
RF-16 : Making Historical Versions of SAS Code While Developing in Enterprise Guide 7.1
Roger Muller, Data To Events, Inc
Tuesday, 10:50 AM - 11:00 AM, Location: Cozzens
A major new feature in SAS Enterprise Guide version 7.1 is the capability to retain historical versions of the SAS code as the programming is proceeding. The programmer has the option to return to a previous point where a "wrong turn in the road" was taken and resume the programming effort from there. All of this is done without keeping a folder of assorted backup versions with cleverly coded sequential names. This facility essentially contains a menu item to commit the version to the history of the file, and a menu item to review this history and select one to restore if desired. This feature is only for the SAS code files stored internally within the Enterprise Guide project. It is not for the project itself. Likewise it is not for SAS code files stored externally to the project. External files can be historically versioned by a 3rd party program such as Subversion which will be briefly addressed. These approaches are referred to as "Git" repositories for version control (the name has no direct computer meaning, but originates in English slang). A simple program workflow will show the use of this tool.
SAS 101
SA-01 : The REPORT Procedure: A Primer for the Compute Block
Jane Eslinger, SAS Institute
Monday, 8:00 AM - 8:50 AM, Location: St Nick A
It is well-known in the world of SAS® programming that the REPORT procedure is one of the best procedures for creating dynamic reports. However, you might not realize that the compute block is where all of the action takes place! Its flexibility enables you to customize your output. This paper is a primer for using a compute block. With a compute block, you can easily change values in your output with the proper assignment statement and add text with the LINE statement. With the CALL DEFINE statement, you can adjust style attributes such as color and formatting. Through examples, you learn how to apply these techniques for use with any style of output. Understanding how to use the compute-block functionality empowers you to move from creating a simple report to creating one that is more complex and informative, yet still easy to use.
SA-02 : Introduction to PROC SQL
Jennifer First-Kluge, Systems Seminar Consultants, Inc.
Monday, 9:00 AM - 9:50 AM, Location: St Nick A
PROC SQL is a powerful Base SAS procedure combining some of the functionality of the DATA and PROC steps into a single procedure. PROC SQL can be an efficient alternative to traditional SAS code and is often used as the interface to other database systems. Topics include: writing SQL code using various styles of the SELECT statement; computing new columns while generating a query; using SQL options to control the appearance of reports; creating multiple reports in a single PROC SQL; using CASE/WHEN clauses to process data conditionally; and joining data from two or more data sets (like a MERGE). The paper also gives an overview of other SQL features: the Pass-Through Facility for working with other DBMS tables, and efficiency issues when using PROC SQL.
SA-03 : SAS Enterprise Guide for Managers and Executives
Jennifer First-Kluge, Systems Seminar Consultants, Inc.
Monday, 10:30 AM - 10:50 AM, Location: St Nick A
SAS Enterprise Guide is an extremely valuable tool for programmers, but it should also be leveraged by managers and executives to do data exploration, get information on the fly, and take advantage of the powerful analytics and reporting that SAS has to offer. This can all be done without learning to program. This paper will overview how the Enterprise Guide tool can improve the process of turning real time data into real time business decisions by managers.
SA-05 : Pruning the SASLOG - Digging into the Roots of NOTEs, WARNINGs, and ERRORs
Andrew Kuligowski, HSN
Monday, 1:00 PM - 1:50 PM, Location: St Nick A
You've sat through constant design meetings. You've endured countless requests for "just one more little change". You even managed to find a creative solution to that nagging technical problem. But, you persevered, and despite all of the obstacles, you've managed to eliminate the final syntax error in your newest SAS® routine. Time to sit back and relax -- uh, not quite ... The primary focus of this presentation will be on techniques to ensure comprehension of your input data. We will look at several messages that are often found in the SASLOG, such as "NOTE: MERGE statement has more than one data set with repeats of BY values.", which imply that there may be gaps in your knowledge of your data! Special emphasis will be placed on the use of ad-hoc queries to assist in finding data anomalies that can cause problems with your SAS code. It is assumed that the reader has a basic understanding of the SASLOG, including its composition, format, and the SAS system options which control its content.
SA-06 : Let SAS Cleanse Your Dirty Data
Kaushal Chaudhary, Sanford Research
Deanna Schreiber-Gregory, National University
Monday, 2:00 PM - 2:20 PM, Location: St Nick A
In an ideal world, every data set is complete, clean, and properly formatted. However, in real world situations, the data available to us is very rarely presented in this form. Data sets usually vary widely in their degrees of completeness, cleanliness, and readiness for an analytical application, with too many data sets lying in a realm of disarray that would make even the most experienced analyst cringe. They may contain any number of problematic events such as outliers, duplicate observations, missing values, invalid character and numeric data values, as well as many other issues that may appear during the data exploration phase of a project. Given the necessity that the data being examined is as complete and clean as possible, it is very important that these issues are addressed prior to any analysis. In this paper we describe several techniques for cleaning and preparing data for an analytical application through the use of data step functions (first.variable & last.variable, put & input, etc.), base SAS procedures (Proc Univariate, Proc Freq, and Proc Means), and PROC SQL. These techniques are explored within the context of SAS 9.4 and presented in a way that would benefit beginning and moderate level SAS users.
SA-07 : Getting and Staying Organized: Tips for Improving the SAS Data Analysis/Analyst Experience
Harlan Sayles, University of Nebraska Medical Center
Tuesday, 8:00 AM - 8:20 AM, Location: St Nick A
This paper discusses a variety of simple tips that users can apply to become better organized and more efficient SAS users. Main points of emphasis include use of the autoexec.sas file, file structure organization, organization of the user's SAS program file (including use of the %include statement, options, proc format, and code organization), and the writing of good, meaningful comments. The tips and recommendations are not restricted to any specific software or version of SAS. The targeted audience ranges from newer SAS users to those who have been using SAS for a little while and are looking for information about how to improve their skill and organization.
SA-08 : Are You Missing Out? Working with Missing Values to Make the Most of What is not There
Art Carpenter, CA Occidental Consultants
Tuesday, 8:30 AM - 8:50 AM, Location: St Nick A
Everyone uses and works with missing values, however many SAS® programmers are unaware of the variety of tools, options, and techniques associated with using missing values. Did you know that there are 28 types of numeric missing values? Did you know that the numeric missing value (.) is neither the smallest nor the largest possible numeric missing value? Are you aware of the System options, DATA step functions, and DATA step routines that specifically deal with missing values? Do you understand how the macro null value is the same as, and different from, DATA step missing values? Are you aware that observations with missing classification variables may or may not be excluded from analyses depending on the procedure and various options? This paper explores various aspects of the world of missing values. The above questions and others are discussed. Learn more about missing values and make sure that you are not missing out.
SA-09 : Importing multiple spreadsheets in an Excel Workbook: An introduction to Macros
Nathan Becker, Pearson Vue
William Muntean, Pearson Vue
Tuesday, 9:00 AM - 9:20 AM, Location: St Nick A
The SAS IMPORT procedure makes importing Excel files quite simple. However, a single call to the procedure imports only one spreadsheet. Importing more than one sheet from a larger Excel workbook requires running the procedure multiple times and knowing the name of each spreadsheet. Furthermore, handling Excel files that vary in spreadsheet naming conventions means changing SAS code every time a new naming scheme is encountered. This is undesirable, especially for repetitive reports. This paper provides a solution for importing multiple Excel spreadsheets without knowing their names. It introduces the concept of macros and demonstrates the power of generalizable, reusable code. The example is a macro (%Excel) that imports every spreadsheet in an Excel workbook into SAS.
SA-10 : Improving Data Collection Efficiency and Programming Process Flow using SAS Enterprise Guide with Epi Info
Chad Wetzel, Douglas County Health Department
Justin Frederick, Douglas County Health Department
Anne O'Keefe, Douglas County Health Department
Tuesday, 9:30 AM - 9:50 AM, Location: St Nick A
Epi Info is a software program commonly used by public health professionals for epidemiological data collection, analysis, and reporting. There are many benefits of using Epi Info in conjunction with SAS. Using the two systems together enables users to benefit from the strengths of both while improving the efficiency of managing and analyzing epidemiological data. Epi Info provides an easy to use data entry form for manual data entry and SAS Enterprise Guide simplifies the complexities needed for running multiple SAS programs. In order to combine epidemiological data reported electronically with manually entered data in Epi Info, numerous lines of SAS code are required to match variables and formats that can be converted to Epi Info. It also requires a complex order of sorting and combining of data subsets in order to accurately and efficiently extract data from both systems, update data entries with all available data, and remove duplicated data entries. This paper focuses on how SAS Enterprise Guide better organizes the multiple steps required for complex projects, increases efficiency of data collection, creates intermediate datasets needing to be monitored, and generates epidemiological reports using output from multiple SAS programs.
SA-11 : Basic SAS® PROCedures for Producing Quick Results
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 1:00 PM - 1:50 PM, Location: St Nick A
As data analysts, programmers, statisticians, and analytical professionals know, saving time is critical. Delivering timely and quality looking reports and information to management, end users, and customers is essential. The SAS System provides numerous "canned" PROCedures for generating quick results to take care of these needs ... and more. Attendees learn how basic SAS PROCedures such as PRINT, FORMS and SQL produce detail output; how FREQ, MEANS, SQL, UNIVARIATE and SGPLOT create summary, tabular, statistical and graphical output; and how utility PROCedures such as DATASETS manage data libraries. Additional topics include techniques for informing the SAS System which data set to use as input to a procedure, how to subset data using a WHERE statement (or WHERE= data set option), and how to perform BY-group processing.
SA-12 : Top Ten SAS® Performance Tuning Techniques
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 3:00 PM - 3:50 PM, Location: St Nick A
The Base-SAS® software provides users with many choices for accessing, manipulating, analyzing, and processing data and results. Partly due to the power offered by the SAS software and the size of data sources, many application developers and end-users are in need of guidelines for more efficient use. This presentation highlights my personal top ten list of performance tuning techniques for SAS users to apply in their applications. Attendees learn DATA and PROC step language statements and options that can help conserve CPU, I/O, data storage, and memory resources while accomplishing tasks involving processing, sorting, grouping, joining (merging), and summarizing data.
SA-13 : Downloading, Configuring, and Using the Free SAS® University Edition Software
Kirk Paul Lafler, Software Intelligence Corporation
Charlie Shipp, Consider Consulting Corporation
Tuesday, 10:30 AM - 11:20 AM, Location: St Nick A
The announcement of SAS Institute's free SAS University Edition is an exciting development for SAS users and learners around the world! The software bundle includes Base SAS, SAS/STAT, SAS/IML, SAS Studio (user interface), and SAS/ACCESS for Windows, with all the popular features found in the licensed SAS versions. This is an incredible opportunity for users, statisticians, data analysts, scientists, programmers, students, and academics everywhere to use (and learn) SAS for career opportunities and advancement. Capabilities include data manipulation, data management, a comprehensive programming language, powerful analytics, high-quality graphics, world-renowned statistical analysis capabilities, and many other exciting features. This presentation discusses and illustrates the process of downloading and configuring the SAS University Edition. Additional topics include the process of downloading the required applications, key configuration strategies to run the SAS University Edition on your computer, and the demonstration of a few powerful features found in this exciting software bundle. We conclude with a summary of tips for success in downloading, configuring and using the SAS University Edition.
SA-14 : Beyond IF THEN ELSE: Techniques for Conditional Execution of SAS Code
Josh Horstman, Nested Loop Consulting
Tuesday, 11:30 AM - 11:50 AM, Location: St Nick A
Nearly every SAS program includes logic that causes certain code to be executed only when specific conditions are met. This is commonly done using the IF...THEN...ELSE syntax. In this paper, we will explore various ways to construct conditional SAS logic, including some that may provide advantages over the IF statement. Topics will include the SELECT statement, the IFC and IFN functions, the COALESCE and COALESCEC functions, as well as some more esoteric methods, and we'll make sure we understand the difference between a regular IF and the %IF statement in the macro language.
SA-15 : Good SAS Programming Practices
Pat O'Meara, Pat O'Meara Associates, Inc.
Tuesday, 2:00 PM - 2:50 PM, Location: St Nick A
The PhUSE Wiki website is a collaboration of PhUSE, FDA, and pharmaceutical industry members that provides a means to share information related to clinical trials informatics. Among the documents on the website is a white paper describing principles of good programming practice (GPP). This presentation describes the principles of GPP as they relate to SAS programming. Several "before and after" examples taken from real SAS programs are presented.
Tools of the Trade
TT-01 : Dynamic Dashboards Using Base-SAS® Software
Kirk Paul Lafler, Software Intelligence Corporation
Tuesday, 9:00 AM - 9:50 AM, Location: Washington City
Dynamic interactive visual displays known as dashboards are most effective when they show essential graphs, tables, statistics, and other information where data is the star. The first rule for creating an effective SAS® dashboard is to keep it simple. Striking a balance between content and style, a dashboard should be devoid of excessive clutter so as not to distract from or obscure the information displayed. The second rule of effective dashboard design involves displaying data that meets one or more business or organizational objectives. To accomplish this, the elements in a dashboard should convey information in a format easily understood by its intended audience. Attendees learn how to create dynamic interactive user- and data-driven dashboards, graphical and table-driven dashboards, statistical dashboards, and drill-down dashboards with a purpose using Base-SAS® programming techniques including DATA step, PROC FORMAT, PROC PRINT, PROC MEANS, PROC SQL, ODS, Statistical Graphics and HTML.
TT-02 : You Can't Spell 'Assume' without S.A.S.
Mike Tangedal, Capella University
Tuesday, 8:00 AM - 8:50 AM, Location: Washington City
The world of SAS programming is fraught with assumptions. One major assumption is to immediately apply fancy reporting SAS procedures directly to source data. However, preventative programming logic applied during the data step also might fall prey to assumptions due to the whims of the Program Data Vector. This paper serves to highlight the most frequent assumptions on the road from source data to reportable results. Included is the saga of date formatting, default value overriding, importing and exporting files, table joins, variable length and format, as well as assumptions made from what is not shown in the data. Through proactive SAS programming, assumptions can be eliminated resulting in the most credible reported results.
TT-03 : Leveraging Hadoop from the Comfort of SAS
Emily Hawkins, UnitedHealthcare
Lal Puthenveedu Rajanpillai, UnitedHealthcare
Tuesday, 10:30 AM - 11:20 AM, Location: Washington City
Hadoop for distributed computing is rapidly becoming the most talked-about and sought-after solution for big data, and at times a confusing ecosystem. With a vast range of capabilities, from data access tools to provisioning and governance tools, one can quickly become overwhelmed in a sea of funny names. However, by understanding a little bit about Hadoop, the average SAS programmer can comfortably merge the power of SAS with the distributed computing capabilities of Hadoop. Leveraging Hadoop from the Comfort of SAS will provide a basic introduction to Hadoop from a SAS programmer's perspective. It will explain why many companies see adopting Hadoop into their enterprise solution as imperative. We'll cover connecting to HDFS using both LIBNAME and PROC SQL passthrough, as well as moving data in and out of HDFS with Pig using PROC HADOOP. One of the most important aspects of using Hadoop with SAS is pushing processing to the distributed cluster, so we will also cover how to ensure that you are leveraging the power of Hadoop to run efficient programs.
TT-04 : Multiple Imputation for Arbitrary Missing Data: SAS vs R
Kaushal Chaudhary, Sanford Research
Deanna Schreiber-Gregory, National University
Tuesday, 11:30 AM - 11:50 AM, Location: Washington City
Missing data values are a common problem in the vast majority of real-world data analysis situations. With this problem being as prominent as it is, it is important to have a well-rounded arsenal of skills to combat it. Multiple imputation is one such tactic that is well supported as an effective resolution to this issue, especially when the missing data has an arbitrary component to it. Multiple imputation can be conducted through several different programs, but SAS and R are by far the most popular choices. If both of these programs are available to the analyst, which should they choose? In this paper, the authors review different methods for conducting multiple imputation on arbitrary values in both SAS and R, with emphasis placed on the differences between these two programs. These techniques are explored within the context of SAS 9.4 and presented in a way that would benefit beginning and intermediate-level SAS users, especially those versed in both SAS and R.
TT-05 : SAS® Enterprise Guide® Base SAS® Program Nodes ~ Automating Your SAS World With a Dynamic FILENAME Statement, Dynamic Code, and the CALL EXECUTE Command; Your Newest BFF (Best Friends Forever) in SAS
Kent Phelps, The SASketeers
Ronda Phelps, The SASketeers
Tuesday, 1:00 PM - 1:50 PM, Location: Washington City
Communication is the basic foundation of all relationships, including our relationship with SAS and the Server, PC, or Mainframe. One way to communicate more efficiently, and to increasingly automate your SAS World, is to transform Static Code into Dynamic Code that automatically recreates the original Static Code, and then executes the Static Code automatically. Our presentation highlights the powerful SAS Partnership which occurs when a Dynamic FILENAME Statement, Dynamic Code, and the CALL EXECUTE Command are creatively combined within SAS Enterprise Guide Base SAS Program Nodes. You will have the exciting opportunity to learn how 1,469 time-consuming Manual Steps are amazingly replaced with only 2 time-saving Dynamic Automated Steps. We invite you to attend our session where we will detail the UNIX syntax for our project example and introduce you to your newest BFF (Best Friends Forever) in SAS. (Please see the Appendices to review starting point information regarding the syntax for Windows and z/OS, and to review the code that created the data sets for our project example.)
TT-06 : Using SAS Programs to Conduct Discriminate Analysis
Mengyu Liu, University of Southern California
Tuesday, 2:00 PM - 2:20 PM, Location: Washington City
This paper discusses how SAS programs are used to conduct discriminant analysis. Discriminant function analysis is used to determine which variables discriminate between two or more naturally occurring groups. Vocalization data were used, consisting of recorded calls of harp seals from three herds, with eight features per call. The goal of this analysis is to determine whether the vocalization data can be used to construct a rule which discriminates between the three herds of seals. Stepwise regression was used to select main effect variables. PROC DISCRIM was used to conduct the analysis, and the POOL=TEST option was added to test whether the variance-covariance matrices of the response are the same across groups. If the assumption was met, a linear discriminant function would be used; otherwise, a quadratic discriminant function would be derived. Moreover, since the multivariate normality assumptions were not satisfied, nonparametric methods were also applied: the kth-nearest-neighbor (KNN) rule and a method based on kernel density estimates, whose results were compared with those from KNN. Error rates (misclassification rates) were compared across methods and rules, and only error rates smaller than the rate expected under random assignment were considered. This session provides insight into the practical use of SAS and the process of discriminant analysis.
TT-09 : A Demonstration of SAS Analytics in Teradata
Helen Fowler, Teradata
Tho Nguyen, Teradata
Tuesday, 3:00 PM - 3:20 PM, Location: Washington City
SAS analytics in Teradata refers to the integration of advanced analytics into the data warehouse. With this capability, analytic processing is optimized to run where the data reside, in parallel, without having to copy or move the data for analysis. Many analytical computing solutions and large databases use this technology because it provides significant performance improvements over more traditional methods. Come see how SAS Analytics in Teradata works and learn some of the best