Do you understand the Program Data Vector? "Stick to the basics": did you ever get that advice? Two papers at MWSUG 2013 (the Midwest SAS Users Group conference) use the Program Data Vector (PDV), one of the basic SAS processing concepts, to help users improve their DATA step programming skills and to show why they might encounter unexpected errors in their DATA step programs.

In The Secret Life of DATA STEP, Swati Agarwal shared a refresher on how a SAS DATA step compiles and executes behind the scenes. She reminded listeners that each SAS DATA step functions as a self-contained mini-program that is compiled and then executed in an implied loop. Agarwal explains the Program Data Vector this way: it's a storage place in memory that contains all of the variables encountered by your DATA step. The PDV is where SAS builds the data set, one observation at a time. During processing, the DATA step also generates certain automatic variables that can be used for further processing.
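To make the PDV a little more concrete, here is a minimal sketch that is not taken from either paper. The data set name work.scores and the variables name and score are made up for illustration. PUT _ALL_ writes the current contents of the PDV to the log on each pass of the implied loop, including the automatic variables _N_ (the iteration counter) and _ERROR_.

```sas
/* Hypothetical sample data; work.scores, name and score are illustrative names only. */
data work.scores;
   /* Each pass of the implied loop reads one data line into the PDV. */
   input name $ score;
   /* Write every variable in the PDV to the log, including _N_ and _ERROR_. */
   put _all_;
   /* At the bottom of the step, SAS writes the PDV to work.scores (implicit OUTPUT). */
   datalines;
Ann 90
Bob 85
;
run;
```

Reading the log next to the output data set shows that _N_ and _ERROR_ live in the PDV during execution but are not written to the data set.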
She says that when you want to do complex processing, you'll want "concrete knowledge of what the PDV is holding and the rules SAS observes in manipulating that information." Agarwal emphasized that if SAS programmers understand what happens during each of these three important aspects of DATA step processing (the compile phase, the execute phase and the PDV), then they can exercise better control over how data are read and output. While the PDV is most commonly associated with reading raw data into a SAS data set, a PDV is also created whenever the DATA step contains a MERGE, SET, MODIFY or UPDATE statement. More importantly, default processing may behave differently in those cases.

In Anatomy of a Merge Gone Wrong, James Lew and Joshua Horstman shared one programming pitfall that requires concrete knowledge of the PDV. Errors with merging data sets can arise from a number of sources, including not understanding the inner workings of the DATA step. Their paper explains how the automatic retain in DATA step processing can trip up even the most wary of SAS programmers. There is a common misconception that the values in the PDV are always reset to missing when processing returns to the top of the DATA step during the execution phase. However, when reading data with a SET, MERGE, MODIFY or UPDATE statement, variable values are automatically retained from one iteration to the next. Lew and Horstman suggest two ways to avoid errors caused by the automatic retain: (1) merge data sets in a separate DATA step and then perform any additional processing in subsequent DATA steps, or (2) rename one of the input variables.

Both papers emphasize why it's worth learning how the DATA step really works. You'll want to check these additional resources.
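The automatic retain described above can be seen in a small sketch. This is an illustration under stated assumptions, not the example from Lew and Horstman's paper: the data sets work.demog and work.visits and their variables are hypothetical. Because demog has only one row per subject, weight is read into the PDV once per BY group and then retained, so modifying it in place compounds on every later row of the same subject.

```sas
/* Hypothetical data: one row per subject in demog, several rows per subject in visits. */
data work.demog;
   input subject weight;   /* weight recorded in kilograms */
   datalines;
1 70
2 80
;
run;

data work.visits;
   input subject visit;
   datalines;
1 1
1 2
2 1
;
run;

/* Pitfall: weight is overwritten in the PDV and retained within the BY group, */
/* so the second visit of subject 1 converts an already-converted value.       */
data work.wrong;
   merge work.visits work.demog;
   by subject;
   weight = weight * 2.2;
run;

/* Fix 1: merge in one step, then do the extra processing in a later step. */
data work.merged;
   merge work.visits work.demog;
   by subject;
run;

data work.fix1;
   set work.merged;          /* weight is re-read from the data set on every iteration */
   weight = weight * 2.2;
run;

/* Fix 2: rename the input variable and compute the result in a new variable. */
data work.fix2;
   merge work.visits work.demog(rename=(weight=weight_kg));
   by subject;
   weight_lb = weight_kg * 2.2;   /* weight_kg itself is never modified */
run;
```

Printing work.wrong next to work.fix1 or work.fix2 (for example with PROC PRINT) makes the difference on the second row of subject 1 easy to spot.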