PostgreSQL Weekly News - April 11, 2021


Feature freeze for PostgreSQL 14 has arrived: any new feature that could appear in PostgreSQL 14 is now in the git repository.

PostgreSQL Product News

AGE 0.4.0, a PostgreSQL extension that provides graph database functionality, has been released: https://github.com/apache/incubator-age/releases/tag/0.4.0
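
A quick taste of the extension's documented openCypher interface, as a hedged sketch (the graph name and node properties are made up; check the AGE documentation for the details of your version):

    -- Load the extension and create a graph
    CREATE EXTENSION age;
    LOAD 'age';
    SET search_path = ag_catalog, "$user", public;
    SELECT create_graph('demo');

    -- Create two nodes and an edge, then query them back
    SELECT * FROM cypher('demo', $$
        CREATE (:Person {name: 'Ada'})-[:KNOWS]->(:Person {name: 'Grace'})
    $$) AS (result agtype);

    SELECT * FROM cypher('demo', $$
        MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name
    $$) AS (a agtype, b agtype);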

PostgreSQL Jobs for April

https://archives.postgresql.org/pgsql-jobs/2021-04/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Tom Lane pushed:

  • Fix more confusion in SP-GiST. spg_box_quad_leaf_consistent unconditionally returned the leaf datum as leafValue, even though in its usage for poly_ops that value is of completely the wrong type. In versions before 12, that was harmless because the core code did nothing with leafValue in non-index-only scans ... but since commit 2a6368343, if we were doing a KNN-style scan, spgNewHeapItem would unconditionally try to copy the value using the wrong datatype parameters. Said copying is a waste of time and space if we're not going to return the data, but it accidentally failed to fail until I fixed the datatype confusion in ac9099fc1. Hence, change spgNewHeapItem to not copy the datum unless we're actually going to return it later. This saves cycles and dodges the question of whether lossy opclasses are returning the right type. Also change spg_box_quad_leaf_consistent to not return data that might be of the wrong type, as insurance against somebody introducing a similar bug into the core code in future. It seems like a good idea to back-patch these two changes into v12 and v13, although I'm afraid to change spgNewHeapItem's mistaken idea of which datatype to use in those branches. Per buildfarm results from ac9099fc1. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/dfc843d465689d2c2af8b0e01c66c51ccaae2343

  • Support INCLUDE'd columns in SP-GiST. Not much to say here: does what it says on the tin. We steal a previously-always-zero bit from the nextOffset field of leaf index tuples in order to track whether there is a nulls bitmap. Otherwise it works about like included columns in other index types. Pavel Borisov, reviewed by Andrey Borodin and Anastasia Lubennikova, and rather heavily editorialized on by me Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/09c1c6ab4bc5764dd69c53ccfd43b2060b1fd090

  • Clean up treatment of missing default and CHECK-constraint records. Andrew Gierth reported that it's possible to crash the backend if no pg_attrdef record is found to match an attribute that has atthasdef set. AttrDefaultFetch warns about this situation, but then leaves behind a relation tupdesc that has null "adbin" pointer(s), which most places don't guard against. We considered promoting the warning to an error, but throwing errors during relcache load is pretty drastic: it effectively locks one out of using the relation at all. What seems better is to leave the load-time behavior as a warning, but then throw an error in any code path that wants to use a default and can't find it. This confines the error to a subset of INSERT/UPDATE operations on the table, and in particular will at least allow a pg_dump to succeed. Also, we should fix AttrDefaultFetch to not leave any null pointers in the tupdesc, because that just creates an untested bug hazard. While at it, apply the same philosophy of "warn at load, throw error only upon use of the known-missing info" to CHECK constraints. CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch, but for reasons lost in the mists of time, it was throwing ERROR for the same cases that AttrDefaultFetch treats as WARNING. Make the two functions more nearly alike. In passing, get rid of potentially-O(N^2) loops in equalTupleDesc by making AttrDefaultFetch sort the entries after fetching them, so that equalTupleDesc can assume that entries in two equal tupdescs must be in matching order. (CheckConstraintFetch already was sorting CHECK constraints, but equalTupleDesc hadn't been told about it.) There's some argument for back-patching this, but with such a small number of field reports, I'm content to fix it in HEAD. Discussion: https://postgr.es/m/87pmzaq4gx.fsf@news-spur.riddles.org.uk https://git.postgresql.org/pg/commitdiff/091e22b2e673e3e8480abd68fbb827c5d6979615

  • Fix missing #include in nodeResultCache.h. Per cpluspluscheck. https://git.postgresql.org/pg/commitdiff/789d81de8a50d9a23cc1a3b8ea5d839246020689

  • Postpone some stuff out of ExecInitModifyTable. Arrange to do some things on-demand, rather than immediately during executor startup, because there's a fair chance of never having to do them at all: * Don't open result relations' indexes until needed. * Don't initialize partition tuple routing, nor the child-to-root tuple conversion map, until needed. This wins in UPDATEs on partitioned tables when only some of the partitions will actually receive updates; with larger partition counts the savings is quite noticeable. Also, we can remove some sketchy heuristics in ExecInitModifyTable about whether to set up tuple routing. Also, remove execPartition.c's private hash table tracking which partitions were already opened by the ModifyTable node. Instead use the hash added to ModifyTable itself by commit 86dc90056. To allow lazy computation of the conversion maps, we now set ri_RootResultRelInfo in all child ResultRelInfos. We formerly set it only in some, not terribly well-defined, cases. This has user-visible side effects in that now more error messages refer to the root relation instead of some partition (and provide error data in the root's column order, too). It looks to me like this is a strict improvement in consistency, so I don't have a problem with the output changes visible in this commit. Extracted from a larger patch, which seemed to me to be too messy to push in one commit. Amit Langote, reviewed at different times by Heikki Linnakangas and myself Discussion: https://postgr.es/m/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/c5b7ba4e67aeb5d6f824b74f94114d99ed6e42b7

  • Postpone some more stuff out of ExecInitModifyTable. Delay creation of the projections for INSERT and UPDATE tuples until they're needed. This saves a pretty fair amount of work when only some of the partitions are actually touched. The logic associated with identifying junk columns in UPDATE/DELETE is moved to another loop, allowing removal of one loop over the target relations; but it didn't actually change at all. Extracted from a larger patch, which seemed to me to be too messy to push in one commit. Amit Langote, reviewed at different times by Heikki Linnakangas and myself Discussion: https://postgr.es/m/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a1115fa0782378a8238045d238ae70cac36be8ae

  • Tighten up allowed names for custom GUC parameters. Formerly we were pretty lax about what a custom GUC's name could be; so long as it had at least one dot in it, we'd take it. However, corner cases such as dashes or equal signs in the name would cause various bits of functionality to misbehave. Rather than trying to make the world perfectly safe for that, let's just require that custom names look like "identifier.identifier", where "identifier" means something that scan.l would accept without double quotes. Along the way, this patch refactors things slightly in guc.c so that find_option() is responsible for reporting GUC-not-found cases, allowing removal of duplicative code from its callers. Per report from Hubert Depesz Lubaczewski. No back-patch, since the consequences of the problem don't seem to warrant changing behavior in stable branches. Discussion: https://postgr.es/m/951335.1612910077@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/3db826bd55cd1df0dd8c3d811f8e5b936d7ba1e4
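
    In practice the tightened rule means custom settings must look like "identifier.identifier"; the parameter names below are invented for illustration:

        -- Accepted: both halves are plain identifiers
        SET my_extension.log_level = 'debug';

        -- Rejected after this commit: "my-extension" is not a plain
        -- identifier (the dash would require double quotes)
        SET "my-extension".log_level = 'debug';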

  • Comment cleanup for a1115fa07. Amit Langote Discussion: https://postgr.es/m/CA+HiwqEcawatEaUh1uTbZMEZTJeLzbroRTz9_X9Z5CFjTWJkhw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0d46771eaaf77ad08555cf34e421234d5943edfa

  • Remove channel binding requirement from clientcert=verify-full test. This fails on older OpenSSL versions that lack channel binding support. Since that feature is not essential to this test case, just remove it, instead of complicating matters. Per buildfarm. Jacob Champion Discussion: https://postgr.es/m/fa8dbbb58c20b1d1adf0082769f80d5466eaf485.camel@vmware.com https://git.postgresql.org/pg/commitdiff/a282ee68a070a8adc6e6d45e8e643769c587ecc3

  • Allow psql's \df and \do commands to specify argument types. When dealing with overloaded function or operator names, having to look through a long list of matches is tedious. Let's extend these commands to allow specification of (input) argument types to let such results be trimmed down. Each additional argument is treated the same as the pattern argument of \dT and matched against the appropriate argument's type name. While at it, fix \dT (and these new options) to recognize the usual notation of "foo[]" for "the array type over foo", and to handle the special abbreviations allowed by the backend grammar, such as "int" for "integer". Greg Sabino Mullane, revised rather significantly by me Discussion: https://postgr.es/m/CAKAnmmLF9Hhu02N+s7uAyLc5J1xZReg72HQUoiKhNiJV3_jACQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a3027e1e7f3d3a107ecd72d3b4d6333ea2aab6a5
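
    A sketch of the new syntax (the exact matching rules are in the psql documentation for PostgreSQL 14):

        -- All overloads of generate_series
        \df generate_series
        -- Trim the list to variants whose arguments match these types;
        -- "int" is expanded to "integer"
        \df generate_series int int
        -- The usual array notation is now recognized too
        \df array_append anyarray
        -- \do accepts the same kind of type patterns
        \do + int int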

  • Suppress uninitialized-variable warning. Several buildfarm critters that don't usually produce such warnings are complaining about e717a9a18. I think it's actually safe, but move initialization to silence the warning. https://git.postgresql.org/pg/commitdiff/01add89454d5dc289ed3126d5de03169bdeff41b

  • Add support for tab-completion of type arguments in \df, \do. Oversight in commit a3027e1e7. https://git.postgresql.org/pg/commitdiff/d1fcbde579d440c35023baa0de7ebf27f644a314

  • Doc: update documentation of check_function_bodies. Adjust docs and description string to note that check_function_bodies applies to procedures too. (In hindsight it should have been named check_routine_bodies, but it seems too late for that now.) Daniel Westermann Discussion: https://postgr.es/m/GV0P278MB04834A9EB9A74B036DC7CE49D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/07b76833b15163c6574ea2c12d05d9a0800665e2

  • Fix failure of xlogprefetch.h to include all prerequisite headers. Per cpluspluscheck. https://git.postgresql.org/pg/commitdiff/99964c4ade468c35a3f6e248a2380a1ff67d9cd3

  • Fix uninitialized variable from commit a4d75c86b. The path for *exprs != NIL would misbehave, and likely crash, since pull_varattnos expects its last argument to be valid at call. Found by Coverity --- we have no coverage of this path in the regression tests. https://git.postgresql.org/pg/commitdiff/9cb92334092fa75afc62a71243bbc1f4612ecfa4

  • Add macro PGWARNING, and make PGERROR available on all platforms. We'd previously noted the need for coping with Windows headers that provide some other definition of macro "ERROR" than elog.h does. It turns out that R also wants to define ERROR, and WARNING too. PL/R has been working around this in a hacky way that broke when we recently changed the numeric value of ERROR. To let them have a more future-proof solution, provide an alternate macro PGWARNING for WARNING, and make PGERROR visible always, not only when #ifdef WIN32. Discussion: https://postgr.es/m/CADK3HHK6iMChd1yoOqssxBn5Z14Zar8Ztr3G-N_fuG7F8YTP3w@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d7cff12c4c035b7cf12bb8454824f48f13018730

Michaël Paquier pushed:

  • Refactor all TAP test suites doing connection checks. This commit refactors more TAP tests to adapt with the recent introduction of connect_ok() and connect_fails() in PostgresNode, introduced by 0d1a3343. This changes the following test suites to use the same code paths for connection checks: - Kerberos - LDAP - SSL - Authentication Those routines are extended to be able to handle optional parameters that are set depending on each suite's needs, as of: - custom SQL query. - expected stderr matching pattern. - expected stdout matching pattern. The new design is extensible with more parameters, and there are some plans for those routines in the future with checks based on the contents of the backend logs. Author: Jacob Champion, Michael Paquier Discussion: https://postgr.es/m/d17b919e27474abfa55d97786cb9cfadfe2b59e9.camel@vmware.com https://git.postgresql.org/pg/commitdiff/c50624cdd248c13b4ba199f95e24c88d2cc8a097

  • Fix typo in collationcmds.c. Introduced by 51e225d. Author: Anton Voloshin Discussion: https://postgr.es/m/05477da0-703c-7de7-998c-5879738e8f39@postgrespro.ru https://git.postgresql.org/pg/commitdiff/9f6f1f9b8e61f9ce47e1936fc68c21a4a8d6722c

  • Change PostgresNode::connect_fails() to never send down queries. This type of failure is similar to what has been fixed in c757a3da, where an authentication failure combined with psql pushing a command down its communication pipe causes a test failure. This routine is designed to fail, so sending a query has little sense anyway. Per buildfarm members gaur and hoverfly, based on an analysis and fix from Tom Lane. Discussion: https://postgr.es/m/513200.1617634642@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/6d41dd045ada28ee14182112fc4cf50fb3879d28

  • Fix some issues with SSL and Kerberos tests. The recent refactoring done in c50624c accidentally broke a portion of the kerberos tests checking after a query, so add its functionality back. Some inactive SSL tests had their arguments in an incorrect order, which would cause them to fail if they were to run. Author: Jacob Champion Discussion: https://postgr.es/m/4f5b0b3dc0b6fe9ae6a34886b4d4000f61eb567e.camel@vmware.com https://git.postgresql.org/pg/commitdiff/5a71964a832febfee23cedc3bb354049d6ca78a7

  • Add some information about authenticated identity via log_connections. The "authenticated identity" is the string used by an authentication method to identify a particular user. In many common cases, this is the same as the PostgreSQL username, but for some third-party authentication methods, the identifier in use may be shortened or otherwise translated (e.g. through pg_ident user mappings) before the server stores it. To help administrators see who has actually interacted with the system, this commit adds the capability to store the original identity when authentication succeeds within the backend's Port, and generates a log entry when log_connections is enabled. The log entries generated look something like this (where a local user named "foouser" is connecting to the database as the database user called "admin"): LOG: connection received: host=[local] LOG: connection authenticated: identity="foouser" method=peer (/data/pg_hba.conf:88) LOG: connection authorized: user=admin database=postgres application_name=psql Port->authn_id is set according to the authentication method: bsd: the PostgreSQL username (aka the local username) cert: the client's Subject DN gss: the user principal ident: the remote username ldap: the final bind DN pam: the PostgreSQL username (aka PAM username) password (and all pw-challenge methods): the PostgreSQL username peer: the peer's pw_name radius: the PostgreSQL username (aka the RADIUS username) sspi: either the down-level (SAM-compatible) logon name, if compat_realm=1, or the User Principal Name if compat_realm=0 The trust auth method does not set an authenticated identity. Neither does clientcert=verify-full. Port->authn_id could be used for other purposes, like a superuser-only extra column in pg_stat_activity, but this is left as future work. PostgresNode::connect_{ok,fails}() have been modified to let tests check the backend log files for required or prohibited patterns, using the new log_like and log_unlike parameters. This uses a method based on a truncation of the existing server log file, like issues_sql_like(). Tests are added to the ldap, kerberos, authentication and SSL test suites. Author: Jacob Champion Reviewed-by: Stephen Frost, Magnus Hagander, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/c55788dd1773c521c862e8e0dddb367df51222be.camel@vmware.com https://git.postgresql.org/pg/commitdiff/9afffcb833d3c5e59a328a2af674fac7e7334fc1

  • Remove redundant memset(0) calls for page init of some index AMs. Bloom, GIN, GiST and SP-GiST rely on PageInit() to initialize the contents of a page, and this routine fills entirely a page with zeros for a size of BLCKSZ, including the special space. Those index AMs have been using an extra memset() call to fill with zeros the special page space, or even the whole page, which is not necessary as PageInit() already does this work, so let's remove them. GiST was not doing this extra call, but has commented out a system call that did so since 6236991. While on it, remove one MAXALIGN() for SP-GiST as PageInit() takes care of that. This makes the whole page initialization logic more consistent across all index AMs. Author: Bharath Rupireddy Reviewed-by: Vignesh C, Mahendra Singh Thalor Discussion: https://postgr.es/m/CALj2ACViOo2qyaPT7krWm4LRyRTw9kOXt+g6PfNmYuGA=YHj9A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/4c0239cb7a7775e3183cb575e62703d71bf3302d

  • Fix some failures with connection tests on Windows hosts. The truncation of the log file, that this set of tests relies on to make sure that a connection attempt matches with its expected backend log pattern, fails, as reported by buildfarm member fairywren. Instead of a truncation, do a rotation of the log file and restart the node. This will ensure that the connection attempt data is unique for each test. Discussion: https://postgr.es/m/YG05nCI8x8B+Ad3G@paquier.xyz https://git.postgresql.org/pg/commitdiff/c7578fa64019f27edc31261ea49066a4b2569a6c

  • Fix typos and grammar in documentation and code comments. Comment fixes are applied on HEAD, and documentation improvements are applied on back-branches where needed. Author: Justin Pryzby Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/609b0652af00374b89411ea2613fd5bb92bca92c

Peter Eisentraut pushed:

Álvaro Herrera pushed:

Fujii Masao pushed:

  • Shut down transaction tracking at startup process exit. Maxim Orlov reported that the shutdown of standby server could result in the following assertion failure. The cause of this issue was that, when the shutdown caused the startup process to exit, recovery-time transaction tracking was not shut down even if it's already initialized, and some locks the tracked transactions were holding could not be released. At this situation, if other process was invoked and the PGPROC entry that the startup process used was assigned to it, it found such unreleased locks and caused the assertion failure, during the initialization of it. TRAP: FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))" This commit fixes this issue by making the startup process shut down transaction tracking and release all locks, at the exit of it. Back-patch to all supported branches. Reported-by: Maxim Orlov Author: Fujii Masao Reviewed-by: Maxim Orlov Discussion: https://postgr.es/m/ad4ce692cc1d89a093b471ab1d969b0b@postgrespro.ru https://git.postgresql.org/pg/commitdiff/ad8b674922eb70dc5cd02951dd82fe2c4c37c80a

  • Add function to log the memory contexts of specified backend process. Commit 3e98c0bafb added pg_backend_memory_contexts view to display the memory contexts of the backend process. However its target process is limited to the backend that is accessing to the view. So this is not so convenient when investigating the local memory bloat of other backend process. To improve this situation, this commit adds pg_log_backend_memory_contexts() function that requests to log the memory contexts of the specified backend process. This information can be also collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. So, pg_log_backend_memory_contexts() allows us to see the memory contexts of specified backend more easily. Only superusers are allowed to request to log the memory contexts because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target backend logs its memory contexts at LOG_SERVER_ONLY level, so that these memory contexts will appear in the server log but not be sent to the client. It logs one message per memory context. Because if it buffers all memory contexts into StringInfo to log them as one message, which may require the buffer to be enlarged very much and lead to OOM error since there can be a large number of memory contexts in a backend. When a backend process is consuming huge memory, logging all its memory contexts might overrun available disk space. To prevent this, now this patch limits the number of child contexts to log per parent to 100. As with MemoryContextStats(), it supposes that practical cases where the log gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large. There was another proposed patch to add the function to return the memory contexts of specified backend as the result sets, instead of logging them, in the discussion. However that patch is not included in this commit because it had several issues to address. Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra, Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/43620e328617c1f41a2a54c8cee01723064e3ffa
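
    For example (superuser only; the PID is whichever backend you want to inspect, taken from pg_stat_activity):

        -- Ask backend 12345 to write its memory context tree to the
        -- server log, one line per context, at LOG_SERVER_ONLY level
        SELECT pg_log_backend_memory_contexts(12345);

        -- Find candidate PIDs
        SELECT pid, backend_type, state FROM pg_stat_activity;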

  • Fix typo in pgstat.c. Introduced by 9868167500. Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm1DqgaLBAJrtGznKk1sR1mH-augmp7LfGvxWwTUhah+rg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/f5d94e405e17a49487672316610630be2f9d0bb7

  • Stop archive recovery if WAL generated with wal_level=minimal is found. Previously if hot standby was enabled, archive recovery exited with an error when it found WAL generated with wal_level=minimal. But if hot standby was disabled, it just reported a warning and continued in that case. Which could lead to data loss or errors during normal operation. A warning was emitted, but users could easily miss that and not notice this serious situation until they encountered the actual errors. To improve this situation, this commit changes archive recovery so that it exits with FATAL error when it finds WAL generated with wal_level=minimal whatever the setting of hot standby. This enables users to notice the serious situation soon. The FATAL error is thrown if archive recovery starts from a base backup taken before wal_level is changed to minimal. When archive recovery exits with the error, if users have a base backup taken after setting wal_level to higher than minimal, they can recover the database by starting archive recovery from that newer backup. But note that if such backup doesn't exist, there is no easy way to complete archive recovery, which may make the database server unstartable and users may lose whole database. The commit adds the note about this risk into the document. Even in the case of unstartable database server, previously by just disabling hot standby users could avoid the error during archive recovery, forcibly start up the server and salvage data from it. But note that this commit makes this procedure unavailable at all. Author: Takamichi Osumi Reviewed-by: Laurenz Albe, Kyotaro Horiguchi, David Steele, Fujii Masao Discussion: https://postgr.es/m/OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com https://git.postgresql.org/pg/commitdiff/9de9294b0c4dac77edb80f029648afca79d14653

  • postgres_fdw: Allow partitions specified in LIMIT TO to be imported. Commit f49bcd4ef3 disallowed postgres_fdw to import table partitions. Because all data can be accessed through the partitioned table which is the root of the partitioning hierarchy, importing only partitioned table should allow access to all the data without creating extra objects. This is a reasonable default when importing a whole schema. But there may be the case where users want to explicitly import one of a partitioned tables' partitions. For that use case, this commit allows postgres_fdw to import tables or foreign tables which are partitions of some other table only when they are explicitly specified in LIMIT TO clause. It doesn't change the behavior that any partitions not specified in LIMIT TO are automatically excluded in IMPORT FOREIGN SCHEMA command. Author: Matthias van de Meent Reviewed-by: Bernd Helmle, Amit Langote, Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/CAEze2Whwg4i=mzApMe+PXxCEfgoZmHGqdqQFW7J4bmj_5p6t1A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a3740c48eb2f91663c7c06c948dfcfb6493d2588
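
    A sketch with made-up server, schema, and partition names:

        -- Partitions are still excluded from a whole-schema import, but
        -- naming one explicitly in LIMIT TO now imports it:
        IMPORT FOREIGN SCHEMA public
            LIMIT TO (measurement_y2021m04)
            FROM SERVER remote_srv INTO local_schema;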

  • Fix test added by commit 9de9294b0c. The buildfarm members "drongo" and "fairywren" reported that the regression test (024_archive_recovery.pl) added by commit 9de9294b0c failed. The cause of this failure is that the test calls $node->init() without "allows_streaming => 1" and which doesn't add pg_hba.conf entry for TCP/IP connection from pg_basebackup. This commit fixes the issue by specifying "allows_streaming => 1" when calling $node->init(). Author: Fujii Masao Discussion: https://postgr.es/m/3cc3909d-f779-7a74-c201-f1f7f62c7497@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/8ee9b662daa6d51b54d21ec274f22a218462ad2d

  • Allow TRUNCATE command to truncate foreign tables. This commit introduces new foreign data wrapper API for TRUNCATE. It extends TRUNCATE command so that it accepts foreign tables as the targets to truncate and invokes that API. Also it extends postgres_fdw so that it can issue TRUNCATE command to foreign servers, by adding new routine for that TRUNCATE API. The information about options specified in TRUNCATE command, e.g., ONLY, CASCADE, etc is passed to FDW via API. The list of foreign tables to truncate is also passed to FDW. FDW truncates the foreign data sources that the passed foreign tables specify, based on that information. For example, postgres_fdw constructs TRUNCATE command using them and issues it to the foreign server. For performance, TRUNCATE command invokes the FDW routine for TRUNCATE once per foreign server that foreign tables to truncate belong to. Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/8ff1c94649f5c9184ac5f07981d8aea9dfd7ac19
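
    For example, assuming "remote_log" is a hypothetical foreign table served by postgres_fdw:

        -- The FDW forwards the truncation to the remote server;
        -- local and foreign tables can be mixed in one command
        TRUNCATE remote_log;
        TRUNCATE local_audit, remote_log CASCADE;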

  • Remove COMMIT_TS_SETTS record. Commit 438fc4a39c prevented the WAL replay from writing COMMIT_TS_SETTS record. By this change there is no code that generates COMMIT_TS_SETTS record in PostgreSQL core. Also we can think that there are no extensions using the record because we've not received so far any complaints about the issue that commit 438fc4a39c fixed. Therefore this commit removes COMMIT_TS_SETTS record and its related code. Even without this record, the timestamp required for commit timestamp feature can be acquired from the COMMIT record. Bump WAL page magic. Reported-by: lx zou zoulx1982@163.com Author: Fujii Masao Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/16931-620d0f2fdc6108f1@postgresql.org https://git.postgresql.org/pg/commitdiff/08aa89b326261b669648df97d4f2a6edba22d26a

  • Avoid unnecessary table open/close in TRUNCATE command. ExecuteTruncate() filters out the duplicate tables specified in the TRUNCATE command, for example in the case where "TRUNCATE foo, foo" is executed. Such duplicate tables obviously don't need to be opened and closed because they are skipped. But previously it always opened the tables before checking whether they were duplicated ones or not, and then closed them if they were. That is, the duplicated tables were opened and closed unnecessarily. This commit changes ExecuteTruncate() so that it opens the table after it confirms that table is not duplicated one, which leads to avoid unnecessary table open/close. Do not back-patch because such unnecessary table open/close is not a bug though it exists in older versions. Author: Bharath Rupireddy Reviewed-by: Amul Sul, Fujii Masao Discussion: https://postgr.es/m/CALj2ACUdBO_sXJTa08OZ0YT0qk7F_gAmRa9hT4dxRcgPS4nsZA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/81a23dd87999ec9fb62554328c69c5b678612d56

Stephen Frost pushed:

  • Add pg_read_all_data and pg_write_all_data roles. A commonly requested use-case is to have a role who can run an unfettered pg_dump without having to explicitly GRANT that user access to all tables, schemas, et al, without that role being a superuser. This addresses that by adding a "pg_read_all_data" role which implicitly gives any member of this role SELECT rights on all tables, views and sequences, and USAGE rights on all schemas. As there may be cases where it's also useful to have a role who has write access to all objects, pg_write_all_data is also introduced and gives users implicit INSERT, UPDATE and DELETE rights on all tables, views and sequences. These roles can not be logged into directly but instead should be GRANT'd to a role which is able to log in. As noted in the documentation, if RLS is being used then an administrator may (or may not) wish to set BYPASSRLS on the login role which these predefined roles are GRANT'd to. Reviewed-by: Georgios Kokolatos Discussion: https://postgr.es/m/20200828003023.GU29590@tamriel.snowman.net https://git.postgresql.org/pg/commitdiff/6c3ffd697e2242f5497ea4b40fffc8f6f922ff60
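
    For example, to let a backup role dump everything without making it a superuser (the role name is made up):

        CREATE ROLE dump_user LOGIN PASSWORD 'secret';
        GRANT pg_read_all_data TO dump_user;
        -- Optional, per the documentation's note about RLS:
        ALTER ROLE dump_user BYPASSRLS;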

Peter Geoghegan pushed:

  • Simplify state managed by VACUUM. Reorganize the state struct used by VACUUM -- group related items together to make it easier to understand. Also stop relying on stack variables inside lazy_scan_heap() -- move those into the state struct instead. Doing things this way simplifies large groups of related functions whose function signatures had a lot of unnecessary redundancy. Switch over to using int64 for the struct fields used to count things that are reported to the user via log_autovacuum and VACUUM VERBOSE output. We were using double, but that doesn't seem to have any advantages. Using int64 makes it possible to add assertions that verify that the first pass over the heap (pruning) encounters precisely the same number of LP_DEAD items that get deleted from indexes later on, in the second pass over the heap. These assertions will be added in later commits. Finally, adjust the signatures of functions with IndexBulkDeleteResult pointer arguments in cases where there was ambiguity about whether or not the argument relates to a single index or all indexes. Functions now use the idiom that both ambulkdelete() and amvacuumcleanup() have always used (where appropriate): accept a mutable IndexBulkDeleteResult pointer argument, and return a result IndexBulkDeleteResult pointer to caller. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b4af70cb210393c9c8f41643acf6b213e21178e7

  • Propagate parallel VACUUM's buffer access strategy. Parallel VACUUM relied on global variable state from the leader process being propagated to workers on fork(). Commit b4af70cb removed most uses of global variables inside vacuumlazy.c, but did not account for the buffer access strategy state. To fix, propagate the state through shared memory instead. Per buildfarm failures on elver, curculio, and morepork. Many thanks to Thomas Munro for off-list assistance with this issue. https://git.postgresql.org/pg/commitdiff/49f49defe7c0a330cca084de5da14ccdfdafc6a3

  • Allocate access strategy in parallel VACUUM workers. Commit 49f49def took entirely the wrong approach to fixing this issue. Just allocate a local buffer access strategy in each individual worker instead of trying to propagate state. This state was never propagated by parallel VACUUM in the first place. It looks like the only reason that this worked following commit 40d964ec was that it involved static global variables, which are initialized to 0 per the C standard. A more comprehensive fix may be necessary, even on HEAD. This fix should at least get the buildfarm green once again. Thanks once again to Thomas Munro for continued off-list assistance with the issue. https://git.postgresql.org/pg/commitdiff/f6b8f19a084ce949522fcbc940dc116c034cfc47

  • Refactor lazy_scan_heap() loop. Add a lazy_scan_heap() subsidiary function that handles heap pruning and tuple freezing: lazy_scan_prune(). This is a great deal cleaner. The code that remains in lazy_scan_heap()'s per-block loop can now be thought of as code that either comes before or after the call to lazy_scan_prune(), which is now the clear focal point. This division is enforced by the way in which we now manage state. lazy_scan_prune() outputs state (using its own struct) that describes what to do with the page following pruning and freezing (e.g., visibility map maintenance, recording free space in the FSM). It doesn't get passed any special instructional state from the preamble code, though. Also cleanly separate the logic used by a VACUUM with INDEX_CLEANUP=off from the logic used by single-heap-pass VACUUMs. The former case is now structured as the omission of index and heap vacuuming by a two pass VACUUM. The latter case goes back to being used only when the table happens to have no indexes (just as it was before commit a96c41fe). This structure is much more natural, since the whole point of INDEX_CLEANUP=off is to skip the index and heap vacuuming that would otherwise take place. The single-heap-pass case doesn't skip any useful work, though -- it just does heap pruning and heap vacuuming together when the table happens to have no indexes. Both of these changes are preparation for an upcoming patch that generalizes the mechanism used by INDEX_CLEANUP=off. The later patch will allow VACUUM to give up on index and heap vacuuming dynamically, as problems emerge (e.g., with wraparound), so that an affected VACUUM operation can finish up as soon as possible. Also fix a very old bug in single-pass VACUUM VERBOSE output. We were reporting the number of tuples deleted via pruning as a direct substitute for reporting the number of LP_DEAD items removed in a function that deals with the second pass over the heap. But that doesn't work at all -- they're two different things. To fix, start tracking the total number of LP_DEAD items encountered during pruning, and use that in the report instead. A single pass VACUUM will always vacuum away whatever LP_DEAD items a heap page has immediately after it is pruned, so the total number of LP_DEAD items encountered during pruning equals the total number vacuumed-away. (They are not equal in the INDEX_CLEANUP=off case, but that's okay because skipping index vacuuming is now a totally orthogonal concept to one-pass VACUUM.) Also stop reporting the count of LP_UNUSED items in VACUUM VERBOSE output. This makes the output of VACUUM VERBOSE more consistent with log_autovacuum's output (because it never showed information about LP_UNUSED items). VACUUM VERBOSE reported LP_UNUSED items left behind by the last VACUUM, and LP_UNUSED items created via pruning HOT chains during the current VACUUM (it never included LP_UNUSED items left behind by the current VACUUM's second pass over the heap). This makes it useless as an indicator of line pointer bloat, which must have been the original intention. (Like the first VACUUM VERBOSE issue, this issue was arguably an oversight in commit 282d2a03, which added the heap-only tuple optimization.) Finally, stop reporting empty_pages in VACUUM VERBOSE output, and start reporting pages_removed instead. This also makes the output of VACUUM VERBOSE more consistent with log_autovacuum's output (which does not show empty_pages, but does show pages_removed). 
An empty page isn't meaningfully different to a page that is almost empty, or a page that is empty but for only a small number of remaining LP_UNUSED items. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Robert Haas robertmhaas@gmail.com Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/7ab96cf6b312cfcd79cdc1a69c6bdb75de0ed30f

  • Remove tupgone special case from vacuumlazy.c. Retry the call to heap_prune_page() in rare cases where there is disagreement between the heap_prune_page() call and the call to HeapTupleSatisfiesVacuum() that immediately follows. Disagreement is possible when a concurrently-aborted transaction makes a tuple DEAD during the tiny window between each step. This was the only case where a tuple considered DEAD by VACUUM still had storage following pruning. VACUUM's definition of dead tuples is now uniformly simple and unambiguous: dead tuples from each page are always LP_DEAD line pointers that were encountered just after we performed pruning (and just before we considered freezing remaining items with tuple storage). Eliminating the tupgone=true special case enables INDEX_CLEANUP=off style skipping of index vacuuming that takes place based on flexible, dynamic criteria. The INDEX_CLEANUP=off case had to know about skipping indexes up-front before now, due to a subtle interaction with the special case (see commit dd695979) -- this was a special case unto itself. Now there are no special cases. And so now it won't matter when or how we decide to skip index vacuuming: it won't affect how pruning behaves, and it won't be affected by any of the implementation details of pruning or freezing. Also remove XLOG_HEAP2_CLEANUP_INFO records. These are no longer necessary because we now rely entirely on heap pruning taking care of recovery conflicts. There is no longer any need to generate recovery conflicts for DEAD tuples that pruning just missed. This also means that heap vacuuming now uses exactly the same strategy for recovery conflicts as index vacuuming always has: REDO routines never need to process a latestRemovedXid from the WAL record, since earlier REDO of the WAL record from pruning is sufficient in all cases. The generic XLOG_HEAP2_CLEAN record type is now split into two new record types to reflect this new division (these are called XLOG_HEAP2_PRUNE and XLOG_HEAP2_VACUUM). Also stop acquiring a super-exclusive lock for heap pages when they're vacuumed during VACUUM's second heap pass. A regular exclusive lock is enough. This is correct because heap page vacuuming is now strictly a matter of setting the LP_DEAD line pointers to LP_UNUSED. No other backend can have a pointer to a tuple located in a pinned buffer that can be invalidated by a concurrent heap page vacuum operation. Heap vacuuming can now be thought of as conceptually similar to index vacuuming and conceptually dissimilar to heap pruning. Heap pruning now has sole responsibility for anything involving the logical contents of the database (e.g., managing transaction status information, recovery conflicts, considering what to do with HOT chains). Index vacuuming and heap vacuuming are now only concerned with recycling garbage items from physical data structures that back the logical database. Bump XLOG_PAGE_MAGIC due to pruning and heap page vacuum WAL record changes. Credit for the idea of retrying pruning a page to avoid the tupgone case goes to Andres Freund. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Andres Freund andres@anarazel.de Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/8523492d4e349c4714aa2ab0291be175a88cb4fc

  • Truncate line pointer array during VACUUM. Teach VACUUM to truncate the line pointer array of each heap page when a contiguous group of LP_UNUSED line pointers appear at the end of the array -- these unused and unreferenced items are excluded. This process occurs during VACUUM's second pass over the heap, right after LP_DEAD line pointers on the page (those encountered/pruned during the first pass) are marked LP_UNUSED. Truncation avoids line pointer bloat with certain workloads, particularly those involving continual range DELETEs and bulk INSERTs against the same table. Also harden heapam code to check for an out-of-range page offset number in places where we weren't already doing so. Author: Matthias van de Meent boekewurm+postgres@gmail.com Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Reviewed-By: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com Discussion: https://postgr.es/m/CAH2-Wzn6a64PJM1Ggzm=uvx2otsopJMhFQj_g1rAj4GWr3ZSzw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/3c3b8a4b26891892bccf3d220580a7f413c0b9ca

  • Add wraparound failsafe to VACUUM. Add a failsafe mechanism that is triggered by VACUUM when it notices that the table's relfrozenxid and/or relminmxid are dangerously far in the past. VACUUM checks the age of the table dynamically, at regular intervals. When the failsafe triggers, VACUUM takes extraordinary measures to finish as quickly as possible so that relfrozenxid and/or relminmxid can be advanced. VACUUM will stop applying any cost-based delay that may be in effect. VACUUM will also bypass any further index vacuuming and heap vacuuming -- it only completes whatever remaining pruning and freezing is required. Bypassing index/heap vacuuming is enabled by commit 8523492d, which made it possible to dynamically trigger the mechanism already used within VACUUM when it is run with INDEX_CLEANUP off. It is expected that the failsafe will almost always trigger within an autovacuum to prevent wraparound, long after the autovacuum began. However, the failsafe mechanism can trigger in any VACUUM operation. Even in a non-aggressive VACUUM, where we're likely to not advance relfrozenxid, it still seems like a good idea to finish off remaining pruning and freezing. An aggressive/anti-wraparound VACUUM will be launched immediately afterwards. Note that the anti-wraparound VACUUM that follows will itself trigger the failsafe, usually before it even begins its first (and only) pass over the heap. The failsafe is controlled by two new GUCs: vacuum_failsafe_age, and vacuum_multixact_failsafe_age. There are no equivalent reloptions, since that isn't expected to be useful. The GUCs have rather high defaults (both default to 1.6 billion), and are expected to generally only be used to make the failsafe trigger sooner/more frequently. Author: Masahiko Sawada sawada.mshk@gmail.com Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1e55e7d1755cefbb44982fbacc7da461fa8684e6
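
    The new GUCs can be lowered to make the failsafe trigger sooner; the values below are purely illustrative:

        ALTER SYSTEM SET vacuum_failsafe_age = 1200000000;
        ALTER SYSTEM SET vacuum_multixact_failsafe_age = 1200000000;
        SELECT pg_reload_conf();   -- defaults are 1.6 billion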

  • Teach VACUUM to bypass unnecessary index vacuuming. VACUUM has never needed to call ambulkdelete() for each index in cases where there are precisely zero TIDs in its dead_tuples array by the end of its first pass over the heap (also its only pass over the heap in this scenario). Index vacuuming is simply not required when this happens. Index cleanup will still go ahead, but in practice most calls to amvacuumcleanup() are usually no-ops when there were zero preceding ambulkdelete() calls. In short, VACUUM has generally managed to avoid index scans when there were clearly no index tuples to delete from indexes. But cases with close to no index tuples to delete were another matter -- a round of ambulkdelete() calls took place (one per index), each of which performed a full index scan. VACUUM now behaves just as if there were zero index tuples to delete in cases where there are in fact "virtually zero" such tuples. That is, it can now bypass index vacuuming and heap vacuuming as an optimization (though not index cleanup). Whether or not VACUUM bypasses indexes is determined dynamically, based on the just-observed number of heap pages in the table that have one or more LP_DEAD items (LP_DEAD items in heap pages have a 1:1 correspondence with index tuples that still need to be deleted from each index in the worst case). We only skip index vacuuming when 2% or less of the table's pages have one or more LP_DEAD items -- bypassing index vacuuming as an optimization must not noticeably impede setting bits in the visibility map. As a further condition, the dead_tuples array (i.e. VACUUM's array of LP_DEAD item TIDs) must not exceed 32MB at the point that the first pass over the heap finishes, which is also when the decision to bypass is made. (The VACUUM must also have been able to fit all TIDs in its maintenance_work_mem-bound dead_tuples space, though with a default maintenance_work_mem setting it can't matter.) This avoids surprising jumps in the duration and overhead of routine vacuuming with workloads where successive VACUUM operations consistently have almost zero dead index tuples. The number of LP_DEAD items may well accumulate over multiple VACUUM operations, before finally the threshold is crossed and VACUUM performs conventional index vacuuming. Even then, the optimization will have avoided a great deal of largely unnecessary index vacuuming. In the future we may teach VACUUM to skip index vacuuming on a per-index basis, using a much more sophisticated approach. For now we only consider the extreme cases, where we can be quite confident that index vacuuming just isn't worth it using simple heuristics. Also log information about how many heap pages have one or more LP_DEAD items when autovacuum logging is enabled. Author: Masahiko Sawada sawada.mshk@gmail.com Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzmkebqPd4MVGuPTOS9bMFvp9MDs5cRTCOsv1rQJ3jCbXw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5100010ee4d5c8ef46619dbd1d17090c627e6d0a

  • Silence another _bt_check_unique compiler warning. Per complaint from Tom Lane Discussion: https://postgr.es/m/1922884.1617909599@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/796092fb84c08162ae803e83a13aa8bd6d9b23d0

Amit Kapila pushed:

David Rowley pushed:

  • Fix compiler warning in fe-trace.c for MSVC. It seems that in MSVC timeval's tv_sec field is of type long. localtime() takes a time_t pointer. Since long is 32-bit even on 64-bit builds in MSVC, passing a long pointer instead of the correct time_t pointer generated a compiler warning. Fix that. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvoRG25X_=ZCGSPb4KN_j2iu=G2uXsRSg8NBZeuhkOSETg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9bc9b4609a246ded5caf3f3d4c0013a002ba2323

  • Fix compiler warning for MSVC in libpq_pipeline.c. DEBUG was already defined by the MSVC toolchain for "Debug" builds. On these systems the unconditional #define DEBUG was causing a 'DEBUG': macro redefinition warning. Here we rename DEBUG to DEBUG_OUPUT and also get rid of the #define which defined this constant. This appears to have been left in the code by mistake. Discussion: https://postgr.es/m/CAApHDvqTTgDm38s4HRj03nhzhzQ1oMOj-RXFUB1pE6Bj07jyuQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/3b82d990ab784881153c0f127e4c1211e9b6065c

  • Cleanup partition pruning step generation. There was some code in gen_prune_steps_from_opexps that needlessly checked a list was not empty when it clearly had to contain at least one item. This prompted a further cleanup operation in partprune.c. Additionally, the previous code could end up adding additional needless INTERSECT steps. However, those do not appear to be able to cause any misbehavior. gen_prune_steps_from_opexps is now no longer in charge of generating combine pruning steps. Instead, gen_partprune_steps_internal, which already does some combine step creation has been given the sole responsibility of generating all combine steps. This means that when we recursively call gen_partprune_steps_internal, since it always now adds a combine step when it produces multiple steps, we can just pay attention to the final step returned. In passing, do quite a bit of work on the comments to try to more clearly explain the role of both gen_partprune_steps_internal and gen_prune_steps_from_opexps. This is fairly complex code so some extra effort to give any new readers an overview of how things work seems like a good idea. Author: Amit Langote Reported-by: Andy Fan Reviewed-by: Kyotaro Horiguchi, Andy Fan, Ryan Lambert, David Rowley Discussion: https://postgr.es/m/CAKU4AWqWoVii+bRTeBQmeVW+PznkdO8DfbwqNsu9Gj4ubt9A6w@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5ac9c4307337313bedeafc21dbbab93ba809241c

  • Speedup ScalarArrayOpExpr evaluation. ScalarArrayOpExprs with "useOr=true" and a set of Consts on the righthand side have traditionally been evaluated by using a linear search over the array. When these arrays contain large numbers of elements then this linear search could become a significant part of execution time. Here we add a new method of evaluating ScalarArrayOpExpr expressions to allow them to be evaluated by first building a hash table containing each element, then on subsequent evaluations, we just probe that hash table to determine if there is a match. The planner is in charge of determining when this optimization is possible and it enables it by setting hashfuncid in the ScalarArrayOpExpr. The executor will only perform the hash table evaluation when the hashfuncid is set. This means that not all cases are optimized. For example CHECK constraints containing an IN clause won't go through the planner, so won't get the hashfuncid set. We could maybe do something about that at some later date. The reason we're not doing it now is from fear that we may slow down cases where the expression is evaluated only once. Those cases can be common, for example, a single row INSERT to a table with a CHECK constraint containing an IN clause. In the planner, we enable this when there are suitable hash functions for the ScalarArrayOpExpr's operator and only when there is at least MIN_ARRAY_SIZE_FOR_HASHED_SAOP elements in the array. The threshold is currently set to 9. Author: James Coleman, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/CAAaqYe8x62+=wn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/50e17ad281b8d1c1b410c9833955bc80fbad4078
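
    A sketch of the kind of query that benefits (the table and column are hypothetical): with nine or more Consts (MIN_ARRAY_SIZE_FOR_HASHED_SAOP) and a hashable operator, the planner sets hashfuncid and the executor probes a hash table instead of scanning the array linearly:

        SELECT count(*)
          FROM events
         WHERE status IN ('a','b','c','d','e','f','g','h','i');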

  • Improve slightly misleading comments in nodeFuncs.c. There were some comments in nodeFuncs.c that, depending on your interpretation of the word "result", could lead you to believe that the comments were badly copied and pasted from somewhere else. If you thought of "result" as the return value of the function that the comment is written in, then you'd be misled. However, if you'd correctly interpreted "result" to mean the result type of the given node type, you'd not have seen any issues. Here we do a small cleanup to try to prevent any future misinterpretations. Per wording suggestion from Tom Lane. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvp+Bw=2Qiu5=uXMKfC7gd0+B=4JvexVgGJU=am2g9a1CA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/152d33bccec7176f50be225bdbedf2e6de214e54

Etsuro Fujita pushed:

Dean Rasheed pushed:

  • pgbench: Function to generate random permutations. This adds a new function, permute(), that generates pseudorandom permutations of arbitrary sizes. This can be used to randomly shuffle a set of values to remove unwanted correlations. For example, permuting the output from a non-uniform random distribution so that all the most common values aren't collocated, allowing more realistic tests to be performed. Formerly, hash() was recommended for this purpose, but that suffers from collisions that might alter the distribution, so recommend permute() for this purpose instead. Fabien Coelho and Hironobu Suzuki, with additional hacking by me. Reviewed by Thomas Munro, Alvaro Herrera and Muhammad Usama. Discussion: https://postgr.es/m/alpine.DEB.2.21.1807280944370.5142@lancre https://git.postgresql.org/pg/commitdiff/6b258e3d688db14aadb58dde2a72939362310684
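
    A small pgbench script sketch, assuming the committed signature permute(i, size [, seed]) returning a value in [0, size):

        \set size 100000
        \set r random_exponential(1, :size, 4.0)
        -- shuffle the skewed values so the hot ids are not collocated
        \set id 1 + permute(:r - 1, :size)
        SELECT abalance FROM pgbench_accounts WHERE aid = :id;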

Heikki Linnakangas pushed:

Tomáš Vondra pushed:

  • Fix handling of clauses incompatible with extended statistics. Handling of incompatible clauses while applying extended statistics was a bit confused - while handling a mix of compatible and incompatible clauses it sometimes incorrectly treated the incompatible clauses as compatible, resulting in a crash. Fixed by reworking the code applying the selected statistics object to make it easier to understand, and adding a proper compatibility check. Reported-by: David Rowley Discussion: https://postgr.es/m/CAApHDvpYT10-nkSp8xXe-nbO3jmoaRyRFHbzh-RWMfAJynqgpQ%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/518442c7f334f3b05ea28b7ef50f1b551cfcc23e

  • Don't add non-existent pages to bitmap from BRIN. The code in bringetbitmap() simply added the whole matching page range to the TID bitmap, as determined by pages_per_range, even if some of the pages were beyond the end of the heap. The query then might fail with an error like this: ERROR: could not open file "base/20176/20228.2" (target block 262144): previous segment is only 131021 blocks In this case, the relation has 262093 pages (131072 and 131021 pages), but we're trying to access block 262144, i.e. the first block of the 3rd segment. At that point _mdfd_getseg() notices the preceding segment is incomplete, and fails. Hitting this in practice is rather unlikely, because: (1) most indexes use power-of-two ranges, so segments and page ranges align perfectly (segment end is also a page range end); (2) the table size has to be just right, with the last segment being almost full - less than one page range from full segment, so that the last page range actually crosses the segment boundary; (3) prefetch has to be enabled. The regular page access checks that pages are not beyond heap end, but prefetch does not. On older releases (before 12) the execution stops after hitting the first non-existent page, so the prefetch distance has to be sufficient to reach the first page in the next segment to trigger the issue. Since 12 it's enough to just have prefetch enabled, the prefetch distance does not matter. Fixed by not adding non-existent pages to the TID bitmap. Backpatch all the way back to 9.6 (BRIN indexes were introduced in 9.5, but that release is EOL). Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/23607a8156d595522c232ff3933d77041d3adaa1

Andres Freund pushed:

Magnus Hagander pushed:

Robert Haas pushed:

  • amcheck: Remove duplicate XID/MXID bounds checks. Commit 3b6c1259f9ca8e21860aaf24ec6735a8e5598ea0 resulted in the same xmin and xmax bounds checking being performed in both check_tuple() and check_tuple_visibility(). Remove the duplication. While at it, adjust some code comments that were overlooked in that commit. Mark Dilger Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com https://git.postgresql.org/pg/commitdiff/4573f6a9af6e232ba073392716a051ae2017d1e9

  • amcheck: fix multiple problems with TOAST pointer validation. First, don't perform database access while holding a buffer lock. When checking a heap, we can validate that TOAST pointers are sane by performing a scan on the TOAST index and looking up the chunks that correspond to each value ID that appears in a TOAST pointer in the main table. But, to do that while holding a buffer lock at least risks causing other backends to wait uninterruptibly, and probably can cause undetected and uninterruptible deadlocks. So, instead, make a list of checks to perform while holding the lock, and then perform the checks after releasing it. Second, adjust things so that we don't try to follow TOAST pointers for tuples that are already eligible to be pruned. The TOAST tuples become eligible for pruning at the same time that the main tuple does, so trying to check them may lead to spurious reports of corruption, as observed in the buildfarm. The necessary infrastructure to decide whether or not the tuple being checked is prunable was added by commit 3b6c1259f9ca8e21860aaf24ec6735a8e5598ea0, but it wasn't actually used for its intended purpose prior to this patch. Mark Dilger, adjusted by me to avoid a memory leak. Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com https://git.postgresql.org/pg/commitdiff/ec7ffb8096e8eb90f4c9230f7ba9487f0abe1a9f

Bruce Momjian pushed:

Thomas Munro pushed:

Noah Misch pushed:

Pending Patches

Bharath Rupireddy sent in another revision of a patch to add table AMs for multi- and single inserts, and use same for CTAS, REFRESH MATERIALIZED VIEW, and COPY.

Sait Talha Nisanci sent in another revision of a patch intended to fix a bug that manifested as a crash in record_type_typmod_compare.

Bharath Rupireddy sent in two more revisions of a patch to clarify the error message caused by adding a non-table to a publication, naming the type of object it actually is rather than the type (table) it is not.

Jaime Casanova sent in a patch to use AV worker items infrastructure for GIN pending list's cleanup.

Jürgen Purtz sent in another revision of a patch to add a chapter on architecture to the tutorial.

Andres Freund sent in another revision of a patch to move the stats collector's temporary storage from files to shared memory.

Amul Sul sent in three more revisions of a patch to implement ALTER SYSTEM READ {ONLY | WRITE}.

Michaël Paquier sent in another revision of a patch to make it possible to use NSS as a libpq TLS backend.

Vigneshwaran C, Amit Kapila, and Masahiko Sawada traded patches to use HTAB for replication slot stats.

Bruce Momjian sent in two revisions of a patch to fix and clarify the documentation for some corner cases in interval arithmetic.

Jaime Casanova sent in a patch to document the fact that BRIN's autosummarize parameter is off by default.
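
The reloption itself already exists; a minimal sketch of enabling it explicitly (table and column names hypothetical):

    -- autosummarize defaults to off; enable it per index if desired
    CREATE INDEX events_created_brin ON events
        USING brin (created_at) WITH (autosummarize = on);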

Heikki Linnakangas sent in another revision of a patch to simplify COPY FROM parsing by forcing lookahead.

Bruce Momjian sent in another revision of a patch to implement key management.

Amit Langote sent in two more revisions of a patch to allow batching of inserts during cross-partition updates.

Bertrand Drouvot sent in three more revisions of a patch to make it possible to have minimal logical decoding on standbys.

Himanshu Upadhyaya sent in two revisions of a patch to fix an infelicity between PREPARE TRANSACTION and TEMP TABLEs.

Thomas Munro and Kyotaro HORIGUCHI traded patches to remove the read_page callback from XLogReadRecord.

Justin Pryzby sent in two more revisions of a patch to make pg_ls_* show directories and shared filesets.

Hou Zhijie, Masahiko Sawada, Amit Langote, and Shi Yu traded patches to plug a table reference leak in logical replication.

Fabien COELHO and Michaël Paquier traded patches to add a SHOW_ALL_RESULTS option to psql.

Takamichi Osumi sent in another revision of a patch to make it possible to disable WAL logging to speed up data loading.

Yugo Nagata sent in another revision of a patch to implement incremental view maintenance.

Michael Banck sent in another revision of a patch to add a new PGC_ADMINSET GUC context, administrator, and add a pg_change_role_settings predefined role.

Pavel Borisov sent in a patch to ensure the same treatment of page header and page special size alignment during page init. Both are now simply checked for proper alignment with asserts, rather than being silently MAXALIGNed. Callers should pass properly MAXALIGNed values to the page init function.

Thomas Munro sent in another revision of a patch to add a PSQL_WATCH_PAGER setting for psql's \watch command.

Tom Lane sent in three more revisions of a patch to make it possible for psql \df to choose functions by their arguments.

Vigneshwaran C sent in another revision of a patch to identify missing publications from publishers during CREATE/ALTER SUBSCRIPTION.

Ajin Cherian and Amit Kapila traded patches to add missing documentation for streaming in-progress transactions.

Peter Smith sent in three more revisions of a patch to add logical decoding of two-phase transactions.

Thomas Munro sent in another revision of a patch to implement WAL prefetching.

Thomas Munro and Andrey Borodin traded patches to make all SLRU buffer sizes configurable.

Andrey V. Lepikhov sent in another revision of a patch to remove the 64k rangetable limit.

Haotian Wu sent in a patch to add a --drop-cascade to pg_dump/restore.

Bharath Rupireddy sent in a patch to disallow the RESTART option for CREATE SEQUENCE, as it only makes sense in the case of ALTER SEQUENCE.
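
For illustration (sequence name hypothetical), RESTART only makes sense once there is prior sequence state to reset:

    CREATE SEQUENCE order_seq START 1;
    -- meaningful for an existing sequence:
    ALTER SEQUENCE order_seq RESTART WITH 100;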

Andres Freund sent in a patch to fix a race in InvalidateObsoleteReplicationSlots() and re-remove SlotAcquireBehavior.

Bharath Rupireddy sent in three revisions of a patch to simplify the backend terminate and wait logic in the postgres_fdw test.

Justin Pryzby sent in another revision of a patch to move track_activity_query_size from the previously correct Resource Usage / Memory category to STATS_COLLECTOR, make log_autovacuum_min_duration LOGGING_WHAT, make track_commit_timestamp REPLICATION_SENDING, and change force_parallel_mode to a DEVELOPER GUC, removing it from the sample config to keep users from finding the option and changing it in hopes of making their queries faster without reading the documentation or understanding what it does.

Justin Pryzby sent in another revision of a patch to speed up COPY FROM to partitioned tables with foreign partitions.

Thomas Munro sent in a patch to use SIGIO to detect postmaster death.

Pavel Borisov sent in a patch to stabilize the tablespaces test for partitioned indexes. When doing tablespace tests, sometimes (very rarely) the order of parent and child tables changed, which made the test fail.

Maxim Orlov sent in another revision of a patch intended to fix a bug that manifested as SSL negotiation error on massive connect/disconnect.

Andrey V. Lepikhov sent in another revision of a patch to make asymmetric partitionwise joins work more efficiently by teaching the optimizer to consider a partitionwise join of a non-partitioned table with each partition of a partitioned table. This technique requires changes to the 'reparameterize by child' machinery.

Michaël Paquier sent in a patch to move tablespace path re-creation from the makefiles to pg_regress.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Tom Lane sent in a patch intended to fix a bug that manifested as a type reference leak, by giving up on associating a long-lived tupdesc refcount with the expression nodes involved and relying instead on the fact that typcache entries never go away once made.

Rémi Lapeyre sent in three more revisions of a patch to add header support to "COPY TO" text format, and a corresponding header matching mode to "COPY FROM".
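
A hedged sketch of what the proposed syntax could look like, per the thread (table and path are hypothetical, and the option names are still under discussion, so this may not match what is eventually committed):

    -- write a header line in text format, then require it to match on reload
    COPY items TO '/tmp/items.txt' (FORMAT text, HEADER true);
    COPY items FROM '/tmp/items.txt' (FORMAT text, HEADER match);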

Justin Pryzby sent in another revision of a patch to make ALTER TABLE ... DETACH CONCURRENTLY avoid creating a redundant constraint.

Pavel Stěhule sent in another revision of a patch to add an --options-file option to pg_dump/pg_restore.

Tom Lane sent in a patch to make PGWARNING and PGERROR universally available.

Peter Geoghegan sent in a patch intended to fix a bug that manifested as PANIC: wrong buffer passed to visibilitymap_clear by acquiring a super-exclusive lock in lazy_vacuum_heap_rel() again.

Ranier Vilela sent in a patch to fix an uninitialized scalar variable in src/backend/statistics/extended_stats.c.


Announcing the release of Apache AGE 0.4.0

April 8, 2021 - 02:00
Apache AGE Team has released version 0.4.0 of Apache AGE.

Apache AGE is a PostgreSQL extension that provides graph database functionality. AGE is an acronym for A Graph Extension, and is inspired by Bitnine's fork of PostgreSQL 10, AgensGraph, which is a multi-model database. The goal of the project is to create a single storage layer that can handle both relational and graph model data, so that users can use standard ANSI SQL along with openCypher, the graph query language.

Version 0.4.0 of Apache AGE supports the following:
  • Added auto GROUP BY aggregation support for RETURN and WITH clauses.
  • Added support for more input types in aggregate functions.
  • Added support for DISTINCT in functions.
  • Added an explicit type cast from AGTYPEOID to TEXTOID.
  • Added explicit type casts from agtype to PostgreSQL's smallint, int, and bigint.
  • Added type casting from AGTYPEOID to INT8OID and FLOATOID, for use with user-defined (non-AGE) functions within a Cypher query.
  • Added overloading for INT8OID mathematical operators.
  • Added * and optional edge grammar components.
  • Changed input types for typecasting functions to "any".
  • Added support for the DELETE clause.
  • Added support for stored procedures and PL/pgSQL.
  • Added a basic agtype parser for the Node.js driver.
This release has 2 assets:
  • Source code (zip)
  • Source code (tar.gz)
Please see the repository for details: https://github.com/apache/incubator-age/releases/tag/0.4.0

Feel free to use the project and to send inquiries or ideas to Eya (eya.abdisho@bitnine.net), Apache AGE's active committer. GitHub Issues: https://github.com/apache/incubator-age/issues
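
As a brief usage sketch (graph, label, and property names hypothetical), following AGE's documented cypher() entry point:

    LOAD 'age';
    SET search_path = ag_catalog, "$user", public;
    SELECT create_graph('demo');
    -- openCypher inside SQL; the AS clause declares the result columns
    SELECT * FROM cypher('demo', $$
        CREATE (:Person {name: 'Ada'})
    $$) AS (result agtype);
    SELECT * FROM cypher('demo', $$
        MATCH (p:Person) RETURN p.name
    $$) AS (name agtype);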

PostgreSQL Weekly News - April 4, 2021

April 5, 2021 - 02:00
PostgreSQL Weekly News - April 4, 2021

Person of the week: https://postgresql.life/post/jan_karremans/

PostgreSQL Product News

Ora2Pg 21.1, a tool for migrating Oracle databases to PostgreSQL, released. https://github.com/darold/ora2pg/blob/master/changelog

pgtt 2.3, an extension to implement global temporary tables, released. https://github.com/darold/pgtt/releases/tag/v2.3

SB Data Generator, a GUI tool for generating and populating databases with test data, released.

PostgreSQL Jobs for April

https://archives.postgresql.org/pgsql-jobs/2021-04/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

David Rowley pushed:

  • Cache if PathTarget and RestrictInfos contain volatile functions. Here we aim to reduce duplicate work done by contain_volatile_functions() by caching whether PathTargets and RestrictInfos contain any volatile functions the first time contain_volatile_functions() is called for them. Any future calls for these nodes just use the cached value rather than going to the trouble of recursively checking the sub-node all over again. Thanks to Tom Lane for the idea. Any locations in the code which make changes to a PathTarget or RestrictInfo which could change the outcome of the volatility check must change the cached value back to VOLATILITY_UNKNOWN again. contain_volatile_functions() is the only code in charge of setting the cache value to either VOLATILITY_VOLATILE or VOLATILITY_NOVOLATILE. Some existing code does benefit from this additional caching, however, this change is mainly aimed at an upcoming patch that must check for volatility during the join search. Repeated volatility checks in that case can become very expensive when the join search contains more than a few relations. Author: David Rowley Discussion: https://postgr.es/m/3795226.1614059027@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/f58b230ed0dba2a3d396794a2ec84541e321d92d

  • Adjust design of per-worker parallel seqscan data struct. The design of the data structures which allow storage of the per-worker memory during parallel seq scans were not ideal. The work done in 56788d215 required an additional data structure to allow workers to remember the range of pages that had been allocated to them for processing during a parallel seqscan. That commit added a void pointer field to TableScanDescData to allow heapam to store the per-worker allocation information. However putting the field there made very little sense given that we have AM specific structs for that, e.g. HeapScanDescData. Here we remove the void pointer field from TableScanDescData and add a dedicated field for this purpose to HeapScanDescData. Previously we also allocated memory for this parallel per-worker data for all scans, regardless if it was a parallel scan or not. This was just a wasted allocation for non-parallel scans, so here we make the allocation conditional on the scan being parallel. Also, add previously missing pfree() to free the per-worker data in heap_endscan(). Reported-by: Andres Freund Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20210317023101.anvejcfotwka6gaa@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/af527705edc3fd0b335264d17e0521c05edc5cca

  • Allow users of simplehash.h to perform direct deletions. Previously simplehash.h only exposed a method to perform a hash table delete using the hash table key. This meant that the delete function had to perform a hash lookup in order to find the entry to delete. Here we add a new function so that users of simplehash.h can perform a hash delete directly using the entry pointer, thus saving the hash lookup. An upcoming patch that uses simplehash.h already has performed the hash lookup so already has the entry pointer. This change will allow the code in that patch to perform the hash delete without the code in simplehash.h having to perform an additional hash lookup. Author: David Rowley Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAApHDvqFLXXge153WmPsjke5VGOSt7Ez0yD0c7eBXLfmWxs3Kw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ff53d7b159b93ce9fc884897f9d96b97744781e2

  • Fix compiler warning in unistr function. Some compilers are not aware that elog/ereport ERROR does not return. https://git.postgresql.org/pg/commitdiff/efd9d92bb39c74c2aded64fc08e2d601ce20c39d

  • Allow estimate_num_groups() to pass back further details about the estimation. Here we add a new output parameter to estimate_num_groups() to allow it to inform the caller of additional, possibly useful information about the estimation. The new output parameter is a struct that currently contains just a single field with a set of flags. This was done rather than having the flags as an output parameter to allow future fields to be added without having to change the signature of the function at a later date when we want to pass back further information that might not be suitable to store in the flags field. It seems reasonable that one day in the future that the planner would want to know more about the estimation. For example, how many individual sets of statistics was the estimation generated from? The planner may want to take that into account if we ever want to consider risks as well as costs when generating plans. For now, there's only 1 flag we set in the flags field. This is to indicate if the estimation fell back on using the hard-coded constants in any part of the estimation. Callers may like to change their behavior if this is set, and this gives them the ability to do so. Callers may pass the flag pointer as NULL if they have no interest in obtaining any additional information about the estimate. We're not adding any actual usages of these flags here. Some follow-up commits will make use of this feature. Additionally, we're also not making any changes to add support for clauselist_selectivity() and clauselist_selectivity_ext(). However, if this is required in the future then the same struct being added here should be fine to use as a new output argument for those functions too. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvqQqpk=1W-G_ds7A9CsXX3BggWj_7okinzkLVhDubQzjA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ed934d4fa30f0f94e6f7125ad2154e6a58d1c7f7

  • Add Result Cache executor node. Here we add a new executor node type named "Result Cache". The planner can include this node type in the plan to have the executor cache the results from the inner side of parameterized nested loop joins. This allows caching of tuples for sets of parameters so that in the event that the node sees the same parameter values again, it can just return the cached tuples instead of rescanning the inner side of the join all over again. Internally, result cache uses a hash table in order to quickly find tuples that have been previously cached. For certain data sets, this can significantly improve the performance of joins. The best cases for using this new node type are for join problems where a large portion of the tuples from the inner side of the join have no join partner on the outer side of the join. In such cases, hash join would have to hash values that are never looked up, thus bloating the hash table and possibly causing it to multi-batch. Merge joins would have to skip over all of the unmatched rows. If we use a nested loop join with a result cache, then we only cache tuples that have at least one join partner on the outer side of the join. The benefits of using a parameterized nested loop with a result cache increase when there are fewer distinct values being looked up and the number of lookups of each value is large. Also, hash probes to lookup the cache can be much faster than the hash probe in a hash join as it's common that the result cache's hash table is much smaller than the hash join's due to result cache only caching useful tuples rather than all tuples from the inner side of the join. This variation in hash probe performance is more significant when the hash join's hash table no longer fits into the CPU's L3 cache, but the result cache's hash table does. The apparent "random" access of hash buckets with each hash probe can cause a poor L3 cache hit ratio for large hash tables. Smaller hash tables generally perform better. The hash table used for the cache limits itself to not exceeding work_mem * hash_mem_multiplier in size. We maintain a dlist of keys for this cache and when we're adding new tuples and realize we've exceeded the memory budget, we evict cache entries starting with the least recently used ones until we have enough memory to add the new tuples to the cache. For parameterized nested loop joins, we now consider using one of these result cache nodes in between the nested loop node and its inner node. We determine when this might be useful based on cost, which is primarily driven off of what the expected cache hit ratio will be. Estimating the cache hit ratio relies on having good distinct estimates on the nested loop's parameters. For now, the planner will only consider using a result cache for parameterized nested loop joins. This works for both normal joins and also for LATERAL type joins to subqueries. It is possible to use this new node for other uses in the future. For example, to cache results from correlated subqueries. However, that's not done here due to some difficulties obtaining a distinct estimation on the outer plan to calculate the estimated cache hit ratio. Currently we plan the inner plan before planning the outer plan so there is no good way to know if a result cache would be useful or not since we can't estimate the number of times the subplan will be called until the outer plan is generated. 
The functionality being added here is newly introducing a dependency on the return value of estimate_num_groups() during the join search. Previously, during the join search, we only ever needed to perform selectivity estimations. With this commit, we need to use estimate_num_groups() in order to estimate what the hit ratio on the result cache will be. In simple terms, if we expect 10 distinct values and we expect 1000 outer rows, then we'll estimate the hit ratio to be 99%. Since cache hits are very cheap compared to scanning the underlying nodes on the inner side of the nested loop join, then this will significantly reduce the planner's cost for the join. However, it's fairly easy to see here that things will go bad when estimate_num_groups() incorrectly returns a value that's significantly lower than the actual number of distinct values. If this happens then that may cause us to make use of a nested loop join with a result cache instead of some other join type, such as a merge or hash join. Our distinct estimations have been known to be a source of trouble in the past, so the extra reliance on them here could cause the planner to choose slower plans than it did previous to having this feature. Distinct estimations are also fairly hard to estimate accurately when several tables have been joined already or when a WHERE clause filters out a set of values that are correlated to the expressions we're estimating the number of distinct value for. For now, the costing we perform during query planning for result caches does put quite a bit of faith in the distinct estimations being accurate. When these are accurate then we should generally see faster execution times for plans containing a result cache. However, in the real world, we may find that we need to either change the costings to put less trust in the distinct estimations being accurate or perhaps even disable this feature by default. There's always an element of risk when we teach the query planner to do new tricks that it decides to use that new trick at the wrong time and causes a regression. Users may opt to get the old behavior by turning the feature off using the enable_resultcache GUC. Currently, this is enabled by default. It remains to be seen if we'll maintain that setting for the release. Additionally, the name "Result Cache" is the best name I could think of for this new node at the time I started writing the patch. Nobody seems to strongly dislike the name. A few people did suggest other names but no other name seemed to dominate in the brief discussion that there was about names. Let's allow the beta period to see if the current name pleases enough people. If there's some consensus on a better name, then we can change it before the release. Please see the 2nd discussion link below for the discussion on the "Result Cache" name. Author: David Rowley Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu Tested-By: Konstantin Knizhnik Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b6002a796dc0bfe721db5eaa54ba9d24fd9fd416

  • Revert b6002a796. This removes "Add Result Cache executor node". It seems that something weird is going on with the tracking of cache hits and misses as highlighted by many buildfarm animals. It's not yet clear what the problem is as other parts of the plan indicate that the cache did work correctly, it's just the hits and misses that were being reported as 0. This is especially a bad time to have the buildfarm so broken, so reverting before too many more animals go red. Discussion: https://postgr.es/m/CAApHDvq_hydhfovm4=izgWs+C5HqEeRScjMbOgbpC-jRAeK3Yw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/28b3e3905c982c42fb10ee800e6f881e9742c89d

  • Add Result Cache executor node (take 2). Here we add a new executor node type named "Result Cache". The planner can include this node type in the plan to have the executor cache the results from the inner side of parameterized nested loop joins. This allows caching of tuples for sets of parameters so that in the event that the node sees the same parameter values again, it can just return the cached tuples instead of rescanning the inner side of the join all over again. Internally, result cache uses a hash table in order to quickly find tuples that have been previously cached. For certain data sets, this can significantly improve the performance of joins. The best cases for using this new node type are for join problems where a large portion of the tuples from the inner side of the join have no join partner on the outer side of the join. In such cases, hash join would have to hash values that are never looked up, thus bloating the hash table and possibly causing it to multi-batch. Merge joins would have to skip over all of the unmatched rows. If we use a nested loop join with a result cache, then we only cache tuples that have at least one join partner on the outer side of the join. The benefits of using a parameterized nested loop with a result cache increase when there are fewer distinct values being looked up and the number of lookups of each value is large. Also, hash probes to lookup the cache can be much faster than the hash probe in a hash join as it's common that the result cache's hash table is much smaller than the hash join's due to result cache only caching useful tuples rather than all tuples from the inner side of the join. This variation in hash probe performance is more significant when the hash join's hash table no longer fits into the CPU's L3 cache, but the result cache's hash table does. The apparent "random" access of hash buckets with each hash probe can cause a poor L3 cache hit ratio for large hash tables. Smaller hash tables generally perform better. The hash table used for the cache limits itself to not exceeding work_mem * hash_mem_multiplier in size. We maintain a dlist of keys for this cache and when we're adding new tuples and realize we've exceeded the memory budget, we evict cache entries starting with the least recently used ones until we have enough memory to add the new tuples to the cache. For parameterized nested loop joins, we now consider using one of these result cache nodes in between the nested loop node and its inner node. We determine when this might be useful based on cost, which is primarily driven off of what the expected cache hit ratio will be. Estimating the cache hit ratio relies on having good distinct estimates on the nested loop's parameters. For now, the planner will only consider using a result cache for parameterized nested loop joins. This works for both normal joins and also for LATERAL type joins to subqueries. It is possible to use this new node for other uses in the future. For example, to cache results from correlated subqueries. However, that's not done here due to some difficulties obtaining a distinct estimation on the outer plan to calculate the estimated cache hit ratio. Currently we plan the inner plan before planning the outer plan so there is no good way to know if a result cache would be useful or not since we can't estimate the number of times the subplan will be called until the outer plan is generated. 
The functionality being added here is newly introducing a dependency on the return value of estimate_num_groups() during the join search. Previously, during the join search, we only ever needed to perform selectivity estimations. With this commit, we need to use estimate_num_groups() in order to estimate what the hit ratio on the result cache will be. In simple terms, if we expect 10 distinct values and we expect 1000 outer rows, then we'll estimate the hit ratio to be 99%. Since cache hits are very cheap compared to scanning the underlying nodes on the inner side of the nested loop join, then this will significantly reduce the planner's cost for the join. However, it's fairly easy to see here that things will go bad when estimate_num_groups() incorrectly returns a value that's significantly lower than the actual number of distinct values. If this happens then that may cause us to make use of a nested loop join with a result cache instead of some other join type, such as a merge or hash join. Our distinct estimations have been known to be a source of trouble in the past, so the extra reliance on them here could cause the planner to choose slower plans than it did previous to having this feature. Distinct estimations are also fairly hard to estimate accurately when several tables have been joined already or when a WHERE clause filters out a set of values that are correlated to the expressions we're estimating the number of distinct value for. For now, the costing we perform during query planning for result caches does put quite a bit of faith in the distinct estimations being accurate. When these are accurate then we should generally see faster execution times for plans containing a result cache. However, in the real world, we may find that we need to either change the costings to put less trust in the distinct estimations being accurate or perhaps even disable this feature by default. There's always an element of risk when we teach the query planner to do new tricks that it decides to use that new trick at the wrong time and causes a regression. Users may opt to get the old behavior by turning the feature off using the enable_resultcache GUC. Currently, this is enabled by default. It remains to be seen if we'll maintain that setting for the release. Additionally, the name "Result Cache" is the best name I could think of for this new node at the time I started writing the patch. Nobody seems to strongly dislike the name. A few people did suggest other names but no other name seemed to dominate in the brief discussion that there was about names. Let's allow the beta period to see if the current name pleases enough people. If there's some consensus on a better name, then we can change it before the release. Please see the 2nd discussion link below for the discussion on the "Result Cache" name. Author: David Rowley Reviewed-by: Andy Fan, Justin Pryzby, Zhihong Yu, Hou Zhijie Tested-By: Konstantin Knizhnik Discussion: https://postgr.es/m/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvq=yQXr5kqhRviT2RhNKwToaWr9JAN5t+5_PzhuRJ3wvg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9eacee2e62d89cab7b004f97c206c4fba4f1d745

  • Attempt to fix unstable Result Cache regression tests. force_parallel_mode = regress is causing a few more problems than I thought. It seems that both the leader and the single worker can contribute to the execution. I had mistakenly thought that only the worker process would do any work. Since it's not deterministic as to which of the two processes will get a chance to work on the plan, it seems just better to disable force_parallel_mode for these tests. At least doing this seems better than changing to EXPLAIN only rather than EXPLAIN ANALYZE. Additionally, I overlooked the fact that the number of executions of the sub-plan below a Result Cache will execute a varying number of times depending on cache eviction. 32-bit machines will use less memory and evict fewer tuples from the cache. That results in the subnode being executed fewer times on 32-bit machines. Let's just blank out the number of loops in each node. https://git.postgresql.org/pg/commitdiff/a4fac4ffe8f8d543a10ac7debf1157e34963ece3

  • Remove useless Asserts in Result Cache code. Testing if an unsigned variable is >= 0 is pretty pointless. There's likely enough code in remove_cache_entry() to verify the cache memory accounting is correct in assert enabled builds. These Asserts were not adding much extra cover, even if they had been checking >= 0 on a signed variable. Reported-by: Andres Freund Discussion: https://postgr.es/m/20210402204734.6mo3nfacnljlicgn@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/1267d9862fc6a4f8cdc0ca38d1988b61f39da585
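
A rough sketch of how one might look for the new node (table names hypothetical; whether the planner actually picks a result cache depends on costs and distinct estimates):

    SET enable_resultcache = on;  -- currently the default, per the commit message
    EXPLAIN (COSTS OFF)
    SELECT *
    FROM orders o
    JOIN LATERAL (
        SELECT count(*) AS n
        FROM payments p
        WHERE p.customer_id = o.customer_id
    ) s ON true;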

Peter Geoghegan pushed:

Peter Eisentraut pushed:

Andrew Dunstan pushed:

Álvaro Herrera pushed:

Etsuro Fujita pushed:

Amit Kapila pushed:

Tom Lane pushed:

  • Further tweaking of pg_dump's handling of default_toast_compression. As committed in bbe0a81db, pg_dump from a pre-v14 server effectively acts as though you'd said --no-toast-compression. I think the right thing is for it to act as though default_toast_compression is set to "pglz", instead, so that the tables' toast compression behavior is preserved. You can always get the other behavior, if you want that, by giving the switch. Discussion: https://postgr.es/m/1112852.1616609702@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/54bb91c30e3964fd81059e6b02e377cc9dd2d64c

  • Remove small inefficiency in ExecARDeleteTriggers/ExecARUpdateTriggers. Whilst poking at nodeModifyTable.c, I chanced to notice that while its calls to ExecBRTriggers and ExecIRTriggers are protected by tests to see if there are any relevant triggers to fire, its calls to ExecARTriggers are not; the latter functions do the equivalent tests themselves. This seems possibly reasonable given the more complex conditions involved, but what's less reasonable is that the ExecAR functions aren't careful to do no work when there is no work to be done. ExecARInsertTriggers gets this right, but the other two will both force creation of a slot that the query may have no use for. ExecARUpdateTriggers additionally performed a usually-useless ExecClearTuple() on that slot. This is probably all pretty microscopic in real workloads, but a cycle shaved is a cycle earned. https://git.postgresql.org/pg/commitdiff/65158f497a7d7523ad438b2034d01a560fafe6bd

  • Rework planning and execution of UPDATE and DELETE. This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. 
Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/86dc90056dfdbd9d1b891718d2e5614e3e432f35
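
For instance (hypothetical partitioned table), an inherited UPDATE is now planned as a single SELECT-like subplan with one ModifyTable on top, which can be seen with a plain EXPLAIN:

    EXPLAIN UPDATE measurements
        SET reading = reading + 1
        WHERE city_id = 42;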

  • Improve style of some replication-related error messages. Put the remote end's error message into the primary error string, instead of relegating it to errdetail(). Although this could end up being awkward if the remote sends us a really long error message, it seems more in keeping with our message style guidelines, and more helpful in situations where the errdetail could get dropped. Peter Smith Discussion: https://postgr.es/m/CAHut+Ps-Qv2yQceCwobQDP0aJOkfDzRFrOaR6+2Op2K=WHGeWg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/6197db5340b8154adce1c6d07f6d3325547429c1

  • Suppress compiler warning in libpq_pipeline.c. Some compilers seem to be concerned about the possibility that recv_step is not any of the defined enum values. Silence warnings about uninitialized cmdtag in a different way than I did in 9fb9691a8. https://git.postgresql.org/pg/commitdiff/522d1a89f8d7ed45681988c60bd0a687332a4023

  • Don't prematurely cram a value into a short int. Since a4d75c86b, some buildfarm members have been warning that Assert(attnum <= MaxAttrNumber); is useless if attnum is an AttrNumber. I'm not certain how plausible it is that the value coming out of the bitmap could actually exceed MaxAttrNumber, but we seem to have thought that that was possible back in 7300a6995. Revert the intermediate variable to int so that we have the same overflow protection as before. https://git.postgresql.org/pg/commitdiff/c545e9524dcfcfce25c370f584b31562e8d7a4b7

  • Silence compiler warning in non-assert builds. Per buildfarm. https://git.postgresql.org/pg/commitdiff/8998e3cafa23632790787b8cc726998e84067259

  • Fix portability and safety issues in pqTraceFormatTimestamp. Remove confusion between time_t and pg_time_t; neither gettimeofday() nor localtime() deal in the latter. libpq indeed has no business using <pgtime.h> at all. Use snprintf not sprintf, to ensure we can't overrun the supplied buffer. (Unlikely, but let's be safe.) Per buildfarm. https://git.postgresql.org/pg/commitdiff/f1be740a991406d7885047beb971e1ff5dbe8b71

  • Fix unportable use of isprint(). We must cast the arguments of <ctype.h> functions to unsigned char to avoid problems where char is signed. Speaking of which, considering that this is a <ctype.h> function, it's rather remarkable that we aren't seeing more complaints about not having included that header. Per buildfarm. https://git.postgresql.org/pg/commitdiff/9e20406dd847d0f8c1cbd803786c6d0ad33bcbdd

  • Fix pg_restore's misdesigned code for detecting archive file format. Despite the clear comments pointing out that the duplicative code segments in ReadHead() and _discoverArchiveFormat() needed to be in sync, they were not: the latter did not bother to apply any of the sanity checks in the former. We'd missed noticing this partly because none of those checks would fail in scenarios we customarily test, and partly because the oversight would be masked if both segments execute, which they would in cases other than needing to autodetect the format of a non-seekable stdin source. However, in a case meeting all these requirements --- for example, trying to read a newer-than-supported archive format from non-seekable stdin --- pg_restore missed applying the version check and would likely dump core or otherwise misbehave. The whole thing is silly anyway, because there seems little reason to duplicate the logic beyond the one-line verification that the file starts with "PGDMP". There seems to have been an undocumented assumption that multiple major formats (major enough to require separate reader modules) would nonetheless share the first half-dozen fields of the custom-format header. This seems unlikely, so let's fix it by just nuking the duplicate logic in _discoverArchiveFormat(). Also get rid of the pointless attempt to seek back to the start of the file after successful autodetection. That wastes cycles and it means we have four behaviors to verify not two. Per bug #16951 from Sergey Koposov. This has been broken for decades, so back-patch to all supported versions. Discussion: https://postgr.es/m/16951-a4dd68cf0de23048@postgresql.org https://git.postgresql.org/pg/commitdiff/ec03f2df17a8ba5b431b34dd924e020a0be729f6

  • Rethink handling of pass-by-value leaf datums in SP-GiST. The existing convention in SP-GiST is that any pass-by-value datatype is stored in Datum representation, i.e. it's of width sizeof(Datum) even when typlen is less than that. This is okay, or at least it's too late to change it, for prefix datums and node-label datums in inner (upper) tuples. But it's problematic for leaf datums, because we'd prefer those to be stored in Postgres' standard on-disk representation so that we can easily extend leaf tuples to carry additional "included" columns. I believe, however, that we can get away with just up and changing that. This would be an unacceptable on-disk-format break, but there are two big mitigating factors: 1. It seems quite unlikely that there are any SP-GiST opclasses out there that use pass-by-value leaf datatypes. Certainly none of the ones in core do, nor has codesearch.debian.net heard of any. Given what SP-GiST is good for, it's hard to conceive of a use-case where the leaf-level values would be both small and fixed-width. (As an example, if you wanted to index text values with the leaf level being just a byte, then every text string would have to be represented with one level of inner tuple per preceding byte, which would be horrendously space-inefficient and slow to access. You always want to use as few inner-tuple levels as possible, leaving as much as possible in the leaf values.) 2. Even granting that you have such an index, this change only breaks things on big-endian machines. On little-endian, the high order bytes of the Datum format will now just appear to be alignment padding space. So, change the code to store pass-by-value leaf datums in their usual on-disk form. Inner-tuple datums are not touched. This is extracted from a larger patch that intends to add support for "included" columns. I'm committing it separately for visibility in our commit logs. Pavel Borisov and Tom Lane, reviewed by Andrey Borodin Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1ebdec8c03294e55a9fdb6e676a9e8de680231cc

  • Strip file names reported in error messages on Windows, too. Commit dd136052b established a policy that error message FILE items should include only the base name of the reporting source file, for uniformity and succinctness. We now observe that some Windows compilers use backslashes in FILE strings, so truncate at backslashes as well. This is expected to fix some platform variation in the results of the new libpq_pipeline test module. Discussion: https://postgr.es/m/3650140.1617372290@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/53aafdb9ff6a561c7dea0f428a7c168f2b7e0f16

  • Improve psql's behavior when the editor is exited without saving. When editing the previous query buffer, if the editor is exited without modifying the temp file then clear the query buffer, rather than re-loading (and probably re-executing) the previous query buffer. This reduces the probability of accidentally re-executing something you didn't intend to. Similarly, in "\e file", if the file isn't actually modified then don't load it into the query buffer. And in "\ef" and "\ev", if no changes are made then clear the query buffer instead of loading the function or view definition into it. Cases where we fail to invoke the editor at all, or it returns a nonzero status, are treated like the no-file-modification case. Laurenz Albe, reviewed by Jacob Champion Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b.camel@cybertec.at https://git.postgresql.org/pg/commitdiff/55873a00e3c3349664e7215077dca74ccea08b4d
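
In psql terms, the affected commands are the edit meta-commands, e.g. (function name hypothetical):

    -- \e edits the query buffer; quitting the editor without saving now
    -- clears the buffer instead of re-running the previous query
    \e
    \ef my_function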

  • Fix confusion in SP-GiST between attribute type and leaf storage type. According to the documentation, the attType passed to the opclass config function (and also relied on by the core code) is the type of the heap column or expression being indexed. But what was actually being passed was the type stored for the index column. This made no difference for user-defined SP-GiST opclasses, because we weren't allowing the STORAGE clause of CREATE OPCLASS to be used, so the two types would be the same. But it's silly not to allow that, seeing that the built-in poly_ops opclass has a different value for opckeytype than opcintype, and that if you want to do lossy storage then the types must really be different. (Thus, user-defined opclasses doing lossy storage had to lie about what type is in the index.) Hence, remove the restriction, and make sure that we use the input column type not opckeytype where relevant. For reasons of backwards compatibility with existing user-defined opclasses, we can't quite insist that the specified leafType match the STORAGE clause; instead just add an amvalidate() warning if they don't match. Also fix some bugs that would only manifest when trying to return index entries when attType is different from attLeafType. It's not too surprising that these have not been reported, because the only usual reason for such a difference is to store the leaf value lossily, rendering index-only scans impossible. Add a src/test/modules module to exercise cases where attType is different from attLeafType and yet index-only scan is supported. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/ac9099fc1dd460bffaafec19272159dd7bc86f5b
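
The built-in example mentioned above can be inspected directly: poly_ops indexes polygons but stores their bounding boxes, so opckeytype differs from opcintype:

    SELECT opcname,
           opcintype::regtype  AS input_type,
           opckeytype::regtype AS storage_type
    FROM pg_opclass
    WHERE opcname = 'poly_ops'
      AND opcmethod = (SELECT oid FROM pg_am WHERE amname = 'spgist');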

Stephen Frost pushed:

Bruce Momjian pushed:

Michaël Paquier pushed:

  • Add support for --extension in pg_dump. When specified, only extensions matching the given pattern are included in dumps. Similarly to --table and --schema, when --strict-names is used, a perfect match is required. Also, like the two other options, this new option offers no guarantee that dependent objects have been dumped, so a restore may fail on a clean database. Tests are added in test_pg_dump/, checking after a set of positive and negative cases, with or without an extension's contents added to the dump generated. Author: Guillaume Lelarge Reviewed-by: David Fetter, Tom Lane, Michael Paquier, Asif Rehman, Julien Rouhaud Discussion: https://postgr.es/m/CAECtzeXOt4cnMU5+XMZzxBPJ_wu76pNy6HZKPRBL-j7yj1E4+g@mail.gmail.com https://git.postgresql.org/pg/commitdiff/6568cef26e0f40c25ae54b8e20aad8d1410a854b
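
A short usage sketch (database, file, and extension names hypothetical):

    pg_dump --extension=pg_trgm -d mydb -f mydb_dump.sql
    # with --strict-names, the pattern must match at least one extension
    pg_dump --extension='pg_*' --strict-names -d mydb -f mydb_dump.sql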

  • Fix comment in parsenodes.h. CreateStmt->inhRelations is a list of RangeVars, but a comment was incorrect about that. Author: Julien Rouhaud Discussion: https://postgr.es/m/20210330123015.yzekhz5sweqbgxdr@nol https://git.postgresql.org/pg/commitdiff/7ef64e7e72a65f191fc2f7d4bbe220f53dd8d5de

  • Move some client-specific routines from SSLServer to PostgresNode. test_connect_ok() and test_connect_fails() have always been part of the SSL tests, and check if a connection to the backend should work or not, and there are sanity checks done on specific error patterns dropped by libpq if the connection fails. This was fundamentally wrong in two respects. First, SSLServer.pm works mostly on setting up and changing the SSL configuration of a PostgresNode, and has really nothing to do with the client. Second, the situation became worse in light of b34ca595, where the SSL tests would finish by using a psql command that may not come from the same installation as the node set up. This commit moves those client routines into PostgresNode, making it easier to refactor SSLServer to become more SSL-implementation aware. This can also be reused by the ldap, kerberos and authentication test suites for connection checks, and a follow-up patch should extend those interfaces to match with backend log patterns. Author: Michael Paquier Reviewed-by: Andrew Dunstan, Daniel Gustafsson, Álvaro Herrera Discussion: https://postgr.es/m/YGLKNBf9zyh6+WSt@paquier.xyz https://git.postgresql.org/pg/commitdiff/0d1a33438d3a88938264e12e94c22818307d2f4d

  • doc: Clarify use of ACCESS EXCLUSIVE lock in various sections. Some sections of the documentation used "exclusive lock" to describe that an ACCESS EXCLUSIVE lock is taken during a given operation. This can be confusing to the reader, as ACCESS SHARE is allowed while an EXCLUSIVE lock is held, but that would not be the case with what is described in those parts of the documentation. Author: Greg Rychlewski Discussion: https://postgr.es/m/CAKemG7VptD=7fNWckFMsMVZL_zzvgDO6v2yVmQ+ZiBfc_06kCQ@mail.gmail.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/ffd3391ea94165fbb5adc9534894c62d41138505

  • Improve stability of test with vacuum_truncate in reloptions.sql. This test has been using a simple VACUUM with pg_relation_size() to check if a relation gets physically truncated or not, but forgot the fact that some concurrent activity, like checkpoint buffer writes, could cause some pages to be skipped. The second test enabling vacuum_truncate could fail, seeing a non-empty relation. The first test would not have failed, but could finish by testing a behavior different than the one aimed for. Both tests gain a FREEZE option, to make the vacuums more aggressive and prevent page skips. This is similar to the issues fixed in c2dc1a7. Author: Arseny Sher Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/87tuotr2hh.fsf@ars-thinkpad backpatch-through: 12 https://git.postgresql.org/pg/commitdiff/fe246d1c111d43fd60a1b0afff25ed09b7ae11eb

  • doc: Clarify how to generate backup files with non-exclusive backups. The current instructions describing how to write the backup_label and tablespace_map files are confusing. For example, opening a file in text mode on Windows and copy-pasting the file's contents would result in a failure at recovery because of the extra CRLF characters generated. The documentation was not stating that clearly, and per discussion this is not considered as a supported scenario. This commit extends a bit the documentation to mention that it may be required to open the file in binary mode before writing its data. Reported-by: Wang Shenhao Author: David Steele Reviewed-by: Andrew Dunstan, Magnus Hagander Discussion: https://postgr.es/m/8373f61426074f2cb6be92e02f838389@G08CNEXMBPEKD06.g08.fujitsu.local Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/6fb66c268df2de1112cac3cf0a6cf0a8b96ceaf0

  • Refactor HMAC implementations. Similarly to the cryptohash implementations, this refactors the existing HMAC code into a single set of APIs that can be plugged with any crypto libraries PostgreSQL is built with (only OpenSSL currently). If there are no such libraries, a fallback implementation is available. Those new APIs are designed similarly to the existing cryptohash layer, so there is no real new design here, with the same logic around buffer bound checks and memory handling. HMAC has a dependency on cryptohashes, so all the cryptohash types supported by cryptohash{_openssl}.c can be used with HMAC. This refactoring is an advantage mainly for SCRAM, which included its own implementation of HMAC with SHA256 without relying on the existing crypto libraries even if PostgreSQL was built with their support. This code has been tested on Windows and Linux, with and without OpenSSL, across all the versions supported on HEAD from 1.1.1 down to 1.0.1. I have also checked that the implementations are working fine using some sample results, a custom extension of my own, and doing cross-checks across different major versions with SCRAM with the client and the backend. Author: Michael Paquier Reviewed-by: Bruce Momjian Discussion: https://postgr.es/m/X9m0nkEJEzIPXjeZ@paquier.xyz https://git.postgresql.org/pg/commitdiff/e6bdfd9700ebfc7df811c97c2fc46d7e94e329a2

  • Use more verbose matching patterns for errors in SSL TAP tests. The TAP tests of src/test/ssl/ have been using rather generic matching patterns to check some failure scenarios, like "SSL error" or just "FATAL". These have been introduced in 081bfc1. Those messages are not wrong per se, but when working on the integration of new SSL libraries it becomes hard to know if those errors are legit or not, and existing scenarios may fail in incorrect ways. This commit makes all those messages more verbose by adding the information generated by OpenSSL. Fortunately, the same error messages are used for all the versions supported on HEAD (checked that after running the tests from 1.0.1 to 1.1.1), so the change is straight-forward. Reported-by: Jacob Champion, Álvaro Herrera Discussion: https://postgr.es/m/YGU3AxQh0zBMMW8m@paquier.xyz https://git.postgresql.org/pg/commitdiff/8d3a4c3eae5367fba60ab77c159814defba784fe

Noah Misch pushed:

Joe Conway pushed:

  • Fix has_column_privilege function corner case. According to the comments, when an invalid or dropped column oid is passed to has_column_privilege(), the intention has always been to return NULL. However, when the caller had table level privilege the invalid/missing column was never discovered, because table permissions were checked first. Fix that by introducing extended versions of pg_attribute_acl(check|mask) and pg_class_acl(check|mask) which take a new argument, is_missing. When is_missing is NULL, the old behavior is preserved. But when is_missing is passed by the caller, no ERROR is thrown for dropped or missing columns/relations, and is_missing is flipped to true. This in turn allows has_column_privilege to check for column privileges first, providing the desired semantics. Not backpatched since it is a user visible behavioral change with no previous complaints, and the fix is a bit on the invasive side. Author: Joe Conway Reviewed-By: Tom Lane Reported by: Ian Barwick Discussion: https://postgr.es/m/flat/9b5f4311-157b-4164-7fe7-077b4fe8ed84%40joeconway.com https://git.postgresql.org/pg/commitdiff/b12bd4869b5e64b742a69ca07915e2f77f85a9ae
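
A sketch of the corner case (table name hypothetical): with the fix, a dropped or invalid attribute number yields NULL even when the caller holds table-level privilege:

    CREATE TABLE priv_demo (a int, b int);
    ALTER TABLE priv_demo DROP COLUMN b;
    -- attnum 2 refers to the dropped column; this now returns NULL
    -- instead of reporting the table-level privilege
    SELECT has_column_privilege('priv_demo'::regclass, 2::int2, 'SELECT');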

  • Clarify documentation of RESET ROLE. Command-line options, or previous "ALTER (ROLE|DATABASE) ... SET ROLE ..." commands, can change the value of the default role for a session. In the presence of one of these, RESET ROLE will change the current user identifier to the default role rather than the session user identifier. Fix the documentation to reflect this reality. Backpatch to all supported versions. Author: Nathan Bossart Reviewed-By: Laurenz Albe, David G. Johnston, Joe Conway Reported by: Nathan Bossart Discussion: https://postgr.es/m/flat/925134DB-8212-4F60-8AB1-B1231D750CB4%40amazon.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/174edbe9f9c1538ab3347474e96d176223591cd1
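
A sketch of the documented scenario (role names hypothetical; bob must have been granted to alice for SET ROLE to succeed):

    ALTER ROLE alice SET role = 'bob';  -- per-role default
    -- then, in a new session connected as alice:
    SET ROLE carol;
    RESET ROLE;  -- returns to bob, the configured default,
                 -- not to the session user alice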

Heikki Linnakangas pushed:

  • Add 'noError' argument to encoding conversion functions. With the 'noError' argument, you can try to convert a buffer without knowing the character boundaries beforehand. The functions now need to return the number of input bytes successfully converted. This is a backwards-incompatible change if you have created a custom encoding conversion with CREATE CONVERSION. This adds a check to pg_upgrade for that, refusing the upgrade if there are any user-defined encoding conversions. Custom conversions are very rare; there are no commonly used extensions that I know of that use that feature. No other objects can depend on conversions, so if you do have one, you can fairly easily drop it before upgrading, and recreate it after the upgrade with an updated version. Add regression tests for built-in encoding conversions. This doesn't cover every conversion, but it covers all the internal functions in conv.c that are used to implement the conversions. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi https://git.postgresql.org/pg/commitdiff/ea1b99a6619cd9dcfd46b82ac0d926b0b80e0ae9
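
To see whether an installation would be affected, one could list user-defined conversions before upgrading (a sketch; 16384 is the first OID assigned to user-defined objects):

    SELECT conname, connamespace::regnamespace AS nspname
    FROM pg_conversion
    WHERE oid >= 16384;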

  • Do COPY FROM encoding conversion/verification in larger chunks. This gives a small performance gain, by reducing the number of calls to the conversion/verification function, and letting it work with larger inputs. Also, reorganizing the input pipeline makes it easier to parallelize the input parsing: after the input has been converted to the database encoding, the next stage of finding the newlines can be done in parallel, because there cannot be any newline chars "embedded" in multi-byte characters in the encodings that we support as server encodings. This changes behavior in one corner case: if client and server encodings are the same single-byte encoding (e.g. latin1), previously the input would not be checked for zero bytes ('\0'). Any fields containing zero bytes would be truncated at the zero. But if encoding conversion was needed, the conversion routine would throw an error on the zero. After this commit, the input is always checked for zeros. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi https://git.postgresql.org/pg/commitdiff/f82de5c46bdf8cd65812a7b04c9509c218e1545d

Robert Haas pushed:

Fujii Masao pushed:

Thomas Munro pushed:

Andres Freund pushed:

  • Split wait event related code from pgstat.[ch] into wait_event.[ch]. The wait event related code is independent from the rest of the pgstat.[ch] code, of nontrivial size and changes on a regular basis. Put it into its own set of files. As there doesn't seem to be a good pre-existing directory for code like this, add src/backend/utils/activity. Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210316195440.twxmlov24rr2nxrg@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/a333476b925134f6185037eaff3424c07a9f466f

  • Do not rely on pgstat.h to indirectly include storage/ headers. An upcoming patch might remove the (now indirect) proc.h include (which in turn includes other headers), and it's cleaner for the modified files to include their dependencies directly anyway... Discussion: https://postgr.es/m/20210402194458.2vu324hkk2djq6ce@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/1d9c5d0ce2dcac05850401cf266a9df10a68de49

  • Split backend status and progress related functionality out of pgstat.c. Backend status (supporting pg_stat_activity) and command progress (supporting pg_stat_progress*) related code is largely independent from the rest of pgstat.[ch] (supporting views like pg_stat_all_tables that accumulate data over time). See also a333476b925. This commit doesn't rename the function names to make the distinction from the rest of pgstat_ clearer - that'd be more invasive and not clearly beneficial. If we were to decide to do such a rename at some point, it's better done separately from moving the code as well. Robert's review was of an earlier version. Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210316195440.twxmlov24rr2nxrg@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/e1025044cd4e7f33f7304aed54d5778b8a82cd5d

  • Improve efficiency of wait event reporting, remove proc.h dependency. pgstat_report_wait_start() and pgstat_report_wait_end() required two conditional branches so far. One to check if MyProc is NULL, the other to check if pgstat_track_activities is set. As wait events are used around comparatively lightweight operations, and are inlined (reducing branch predictor effectiveness), that's not great. The dependency on MyProc has a second disadvantage: Low-level subsystems, like storage/file/fd.c, report wait events, but architecturally it is preferable for them to not depend on inter-process subsystems like proc.h (defining PGPROC). After this change including pgstat.h (nor obviously its sub-components like backend_status.h, wait_event.h, ...) does not pull in IPC related headers anymore. These goals, efficiency and abstraction, are achieved by having pgstat_report_wait_start/end() not interact with MyProc, but instead a new my_wait_event_info variable. At backend startup it points to a local variable, removing the need to check for MyProc being NULL. During process initialization my_wait_event_info is redirected to MyProc->wait_event_info. At shutdown this is reversed. Because wait event reporting now does not need to know about where the wait event is stored, it does not need to know about PGPROC anymore. The removal of the branch for checking pgstat_track_activities is simpler: Don't check anymore. The cost due to the branch are often higher than the store - and even if not, pgstat_track_activities is rarely disabled. The main motivator to commit this work now is that removing the (indirect) pgproc.h include from pgstat.h simplifies a patch to move statistics reporting to shared memory (which still has a chance to get into 14). Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210402194458.2vu324hkk2djq6ce@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/225a22b19ed2960acc8e9c0b7ae53e0e5b0eac87

Tomáš Vondra pushed:

Pending Patches

James Hilliard sent in another revision of a patch to fix detection of preadv/pwritev support for OSX.

Mark Rofail sent in another revision of a patch to implement foreign key arrays.

Tomáš Vondra sent in a patch to combine statistics from child relations using a new subcommand, ANALYZE (MERGE).

Zeng Wenjing sent in another revision of a patch to implement global temporary tables.

Marcus Wanner sent in four more revisions of a patch to add an xid argument to the filter_prepare callback for output plugins.

Euler Taveira de Oliveira sent in another revision of a patch to add row filtering specified by a WHERE clause for logical replication.

Peter Smith sent in another revision of a patch to add support for prepared transactions to built-in logical replication.

Arne Roland sent in two more revisions of a patch to make ALTER TRIGGER ... RENAME TO work on partitioned tables.

Tang sent in a patch to update the copyright year for nbtsearch.c.

Paul Guo sent in another revision of a patch to support node initialization from backup with tablespaces, fix the replay of create database records on standby, and fix database create/drop wal description.

Masahiro Ikeda sent in two more revisions of a patch to speed up reporting of WAL stats.

Daniil Zakhlystov sent in two more revisions of a patch to add zlib and zstd streaming compression, and implement libpq compression.

Atsushi Torikoshi and Fujii Masao traded patches to get memory contexts of an arbitrary backend process.

John Naylor sent in two revisions of a patch to document the recently added date_bin() function.

Dean Rasheed and Fabien COELHO traded patches to add a pseudo-random permutation function to pgbench.

Isaac Moreland sent in a patch to add an abs(interval) function and the related @ operator.

Kyotaro HORIGUCHI sent in a patch to make the box type's description clearer.

Vigneshwaran C sent in another revision of a patch to fail a prepared transaction if it has locked system tables/user catalog tables.

Douglas Hirn sent in another revision of a patch to allow multiple linear recursive self-references in WITH RECURSIVE.

Sait Talha Nisanci sent in a patch intended to fix a bug that manifested as crash in record_type_typmod_compare.

Tomáš Vondra sent in a patch to use extended statistics to improve join estimates.

Stephen Frost sent in another revision of a patch to rename default roles to predefined roles.

Vigneshwaran C sent in three revisions of a patch to handle the overwriting of replication slot statistic issue, and add total txns and total txn bytes to replication statistics.

Peter Geoghegan sent in two more revisions of a patch to simplify the state managed by VACUUM, refactor lazy_scan_heap(), remove the tupgone special case from vacuumlazy.c, truncate line pointer array during VACUUM, and bypass index vacuuming in some cases.

Peter Geoghegan and Matthias van de Meent traded patches to truncate a page's line pointer array when it has trailing unused ItemIds, and clobber free page space in PageRepairFragmentation.

Tang sent in another revision of a patch to support tab completion with a query result for upper character inputs in psql.

Fujii Masao sent in another revision of a patch to fix an assertion failure in walreciever.

John Naylor sent in another revision of a patch to replace pg_utf8_verifystr() with two faster implementations: one for Intel-ish processors that uses the SSE-4.1 instruction set, the other which uses a bespoke fallback function rather than one that relies on pg_utf8_verifychar() and pg_utf8_isvalid().

Peter Eisentraut sent in another revision of a patch to change the return type of EXTRACT to numeric.

Stephen Frost sent in a patch to add pg_read_all_data and pg_write_all_data roles.

Thomas Munro sent in a patch to use POSIX_NAMED_SEMAPHORES on OpenBSD.

Fujii Masao and Bharath Rupireddy traded patches to add a postgres_fdw server level option, keep_connections to not cache connection.

Heikki Linnakangas sent in a patch to simplify COPY FROM parsing by forcing lookahead.

Daniel Gustafsson sent in two more revisions of a patch to support NSS as a libpq TLS backend.

Yuzuko Hosoya and Álvaro Herrera traded patches to fix autovacuum on partitioned tables.

Bharath Rupireddy sent in a patch to emit a warning when a partitioned table's persistence is changed.

Amit Langote sent in another revision of a patch to create foreign key triggers in partitioned tables too, and use same to enforce foreign key correctly during cross-partition updates.

Euler Taveira de Oliveira sent in another revision of a patch to refactor the parse_output_parameters function to use the struct PGOutputData that encapsulates all pgoutput options instead of using multiple parameters, and use same to add logical decoding message support to pgoutput.

Peter Eisentraut sent in another revision of a patch to implement the SQL-standard function body.

Justin Pryzby sent in another revision of a patch to implement CLUSTER of partitioned tables.

Amit Langote sent in two more revisions of a patch to export get_partition_for_tuple(), and use same to avoid using SPI for some RI checks.

Julien Rouhaud sent in three more revisions of a patch to move pg_stat_statements query jumbling to core, and use same to expose queryid in pg_stat_activity, log_line_prefix, and verbose explain.

Joel Jacobson sent in a patch to add a MotD function.

Bharath Rupireddy sent in another revision of a patch to implement ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION ...

Erik Rijkers sent in another revision of a patch to fix an old confusing JSON example.

Kazutaka Onishi sent in six more revisions of a patch to implement TRUNCATE on foreign tables.

Thomas Munro sent in another revision of a patch to add a buffer mapping table for SLRUs, and make all SLRU buffer sizes configurable.

Takamichi Osumi sent in two more revisions of a patch to Safeguard for archive recovery not to miss data. This disables the server to start up when it detects WAL generated with wal_level=minimal during archive recovery. This should be done regardless of the value of EnableHotStandby, because we don't think the scenario to go through the period of wal_level=minimal happens. The motivation of this patch is to protect user ends up with getting replica that could miss data in standby mode and getting the server to miss data in recovery mode.

Amit Langote sent in another revision of a patch to set ForeignScanState.resultRelInfo lazily, and initialize result relation information lazily.

Justin Pryzby sent in a patch to make track_activity_query_size a STATS_COLLECTOR category, make sure log_autovacuum_min_duration is LOGGING_WHAT, make track_commit_timestamp REPLICATION_SENDING, and change force_parallel_mode to a DEVELOPER GUC, and remove it from sample the configuration.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Anton Voloshin sent in a patch to fix a typo in collationcmds.c.

Zhihong Yu sent in a patch to remove an unused variable from AttrDefaultFetch.

Amit Langote sent in another revision of a patch to allow batching of inserts during cross-partition updates.

Anton Voloshin sent in a patch to use repalloc() instead of palloc() in icu_convert_case(), as the structure in question might already have been palloc()ed.

Tom Lane sent in a patch to fix some adbin inconsistencies.

Kategóriák: Informatika

Ora2Pg v21.1 released

2021, április 2 - 02:00

Grenoble, France - April 1st, 2021

Ora2Pg

Version 21.1 of Ora2Pg, a free and reliable tool used to migrate an Oracle database to PostgreSQL, has been officially released and is publicly available for download.

This release fix several issues reported during the last six months and adds several new features and improvement.

  • Now that Orafce 3.15.0 has a definition for the REGEXP_* functions makes the translation optional to USE_ORAFCE directive.
  • Add set application name in connection to Oracle/MySql/PostgreSQL.
  • Add translation of REGEXP_COUNT() and change assessment cost.
  • Rewrite the way REGEXP_LIKE() is translated into regexp_match to support modifiers. This rewrite also fix default behavior between Oracle and PostgreSQL.
  • Replace DBMS_LOB.GETLENGTH() by PostgreSQL octet_length() function.
  • Add types correspondences for VARCHAR2 and NVARCHAR2 in DATA_TYPE configuration directive.
  • Add autodetection and support of geometry type, srid and dimension for ArcGis geometries.
  • Add conversion of default value in function parameters.
  • Add -u | --unit option to ora2pg_scanner to be able to set the migration cost unit value globally.
  • Replace DBMS_LOB.SUBSTR() by SUBSTR()
  • Remove TO_CLOB() it is useless, manual cast could be necessary.
  • Replace IS JSON validation clause in CHECK constraints by (CASE WHEN $1::json IS NULL THEN true ELSE true END) When the code is invalid an error is fired.
  • DISTINCT and UNIQUE are synonym on Oracle.

Backward compatibility changes:

  • Force a column to be bigint if this is an identity column. Thanks to MigOps.com for the patch.
  • Fix EMPTY_LOB_NULL, enable/disable was inverted, keeping default to enabled. Take care that in old ora2pg.conf it is disabled so it will break backward compatibility with old configuration.
  • Replace NO_LOB_LOCATOR with USE_LOB_LOCATOR and NO_BLOB_EXPORT with ENABLE_BLOB_EXPORT to avoid confusion with double negative variable. Backward compatibility is preserved with a warning.
  • SRID for SDO_GEOMETRY export is now taken from the value not forced from the metadata table.

For a complete list of change see https://github.com/darold/ora2pg/blob/master/changelog

Thanks to all contributors, they are all cited in the changelog file.

Links & Credits

I would like to thank all users who submitted patches and users who reported bugs and feature requests, they are all cited the ChangeLog file.

Ora2Pg is an open project. Any contribution to build a better tool is welcome. You just have to send your ideas, features requests or patches using the GitHub tools or directly to ora2pg@darold.net.

Links:

- Website: [https://www.ora2pg.com/](https://www.ora2pg.com/) - Download1: [https://github.com/darold/ora2pg/releases](https://github.com/darold/ora2pg/releases) - Download2: [https://sourceforge.net/projects/ora2pg/](https://sourceforge.net/projects/ora2pg/) - Development: [https://github.com/darold/ora2pg](https://github.com/darold/ora2pg) - Changelog: [https://github.com/darold/ora2pg/blob/master/changelog](https://github.com/darold/ora2pg/blob/master/changelog) - Documentation: [https://github.com/darold/ora2pg/blob/master/README](https://github.com/darold/ora2pg/blob/master/README) About Ora2Pg

Ora2Pg is an easy and reliable tool to migrate from Oracle to PostgreSQL. It is developed since 2001 and can export most of the Oracle objects into PostgreSQL compatible code.

Ora2Pg works on any platform and is available under the GPL v3 licence.

Docs, Download & Support at http://www.ora2pg.com/

Kategóriák: Informatika

PostgreSQL Global Temporary Tables extension v2.3 released

2021, április 2 - 02:00

Grenoble - April 2nd, 2021

PostgreSQL Global Temporary Tables

pgtt is a PostgreSQL extension to create, manage and use DB2 or Oracle-style Global Temporary Tables. Once created the use is just like with the other RDBMS.

pgtt v2.3 has been released, this is a maintenance release to fix the compatibility with PostgreSQL 10 and 11.

Complete list of changes is available here

Links & Credits

pgtt is an open project. Any contribution to build a better tool is welcome. You just have to send your ideas, features requests or patches using the GitHub tools or directly to gilles@darold.net.

Links :

About pgtt

Docs & Download at https://github.com/darold/pgtt/

Kategóriák: Informatika

Generate realistic test Data for PostgreSQL with SB Data Generator, a New Tool from Softbuilder

2021, április 1 - 02:00

Softbuilder announces the first release of SB Data Generator, a new GUI tool for generating and populating databases with a large volume of meaningful test data.

SB Data Generator is a simple and powerful GUI tool to generate and populate selected tables or entire databases with realistic test data for your applications. SB Data Generator reverses your database and displays tables and columns, so you can assign to them multiple data generator templates.

Here are some valuable features of SB Data Generator

  • Support generate test data for PostgreSQL and other databases

  • Full support of all essential data types

  • Generate millions of rows of realistic test data

  • Preview what data will be generated before performing the generation

  • Various built-in generators available

  • Ability to create or customize data generators

Learn more about SB Data Generator

About Softbuilder

Softbuilder is a software development company focused on the creation of innovative database tools. The Softbuilder team is composed of passionate and experienced database specialists with more than 15 years of experience in databases, design, development, and maintenance.

Our team of software engineers works hard to provide database professionals with powerful, easy-to-use and complete solution product which allow them to build high-quality software solutions while increasing productivity and reducing development times. Learn more

Kategóriák: Informatika

PostgreSQL Weekly News - March 28, 2021

2021, március 29 - 02:00
PostgreSQL Weekly News - March 28, 2021

Person of the week: https://postgresql.life/post/julien_riou/

PostgreSQL Product News

pspg 4.5.0 a pager designed for PostgreSQL, released. https://github.com/okbob/pspg/releases/tag/4.5.0

pgAdmin4 5.1, a web- and native GUI control center for PostgreSQL, released. https://www.pgadmin.org/docs/pgadmin4/5.1/release_notes_5_1.html

PostgreSQL Jobs for March

https://archives.postgresql.org/pgsql-jobs/2021-03/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Andrew Dunstan pushed:

Tom Lane pushed:

  • Make compression.sql regression test independent of default. This test will fail in "make installcheck" if the installation's default_toast_compression setting is not 'pglz'. Make it robust against that situation. Dilip Kumar Discussion: https://postgr.es/m/CAFiTN-t0w+Rc2U3S+y=7KWcLuOYNB5MfWeGdNa7+pg0UovVdcQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/fd1ac9a548966786cf7978e590be816c55936a50

  • Bring configure support for LZ4 up to snuff. It's not okay to just shove the pkg_config results right into our build flags, for a couple different reasons:

  • This fails to maintain the separation between CPPFLAGS and CFLAGS, as well as that between LDFLAGS and LIBS. (The CPPFLAGS angle is, I believe, the reason for warning messages reported when building with MacPorts' liblz4.) * If pkg_config emits anything other than -I/-D/-L/-l switches, it's highly unlikely that we want to absorb those. That'd be more likely to break the build than do anything helpful. (Even the -D case is questionable; but we're doing that for libxml2, so I kept it.) Also, it's not okay to skip doing an AC_CHECK_LIB probe, as evidenced by recent build failure on topminnow; that should have been caught at configure time. Model fixes for this on configure's libxml2 support. It appears that somebody overlooked an autoheader run, too. Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com https://git.postgresql.org/pg/commitdiff/4d399a6fbeb720b34d33441330910b7d853f703d

  • Fix assorted silliness in ATExecSetCompression(). It's not okay to scribble directly on a syscache entry. Nor to continue accessing said entry after releasing it. Also get rid of not-used local variables. Per valgrind testing. https://git.postgresql.org/pg/commitdiff/ac897c483485d3858ada23ca49650a0f2742a50f

  • Remove useless configure probe for <lz4/lz4.h>. This seems to have been just copied-and-pasted from some other header checks. But our C code is entirely unprepared to support such a header name, so it's only wasting cycles to look for it. If we did need to support it, some #ifdefs would be required. (A quick trawl at codesearch.debian.net finds some packages that reference lz4/lz4.h; but they use only that spelling, and appear to be intending to reference their own copy rather than a system-level installation of liblz4. There's no evidence of freestanding installations that require this spelling.) Discussion: https://postgr.es/m/457962.1616362509@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/2c75f8a612b207c7d36e5dc73317dc9ab6fb29d4

  • Mostly-cosmetic adjustments of TOAST-related macros. The authors of bbe0a81db hadn't quite got the idea that macros named like SOMETHING_4B_C were only meant for internal endianness-related details in postgres.h. Choose more legible names for macros that are intended to be used elsewhere. Rearrange postgres.h a bit to clarify the separation between those internal macros and ones intended for wider use. Also, avoid using the term "rawsize" for true decompressed size; we've used "extsize" for that, because "rawsize" generally denotes total Datum size including header. This choice seemed particularly unfortunate in tests that were comparing one of these meanings to the other. This patch includes a couple of not-purely-cosmetic changes: be sure that the shifts aligning compression methods are unsigned (not critical today, but will be when compression method 2 exists), and fix broken definition of VARATT_EXTERNAL_GET_COMPRESSION (now VARATT_EXTERNAL_GET_COMPRESS_METHOD), whose callers worked only accidentally. Discussion: https://postgr.es/m/574197.1616428079@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/aeb1631ed207cef2d80e20f79eb52c72f03bca7d

  • Short-circuit slice requests that are for more than the object's size. substring(), and perhaps other callers, isn't careful to pass a slice length that is no more than the datum's true size. Since toast_decompress_datum_slice's children will palloc the requested slice length, this can waste memory. Also, close study of the liblz4 documentation suggests that it is dependent on the caller to not ask for more than the correct amount of decompressed data; this squares with observed misbehavior with liblz4 1.8.3. Avoid these problems by switching to the normal full-decompression code path if the slice request is >= datum's decompressed size. Tom Lane and Dilip Kumar Discussion: https://postgr.es/m/507597.1616370729@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/063dd37ebc7644e8db6419565b50dca019e69e86

  • Avoid possible crash while finishing up a heap rewrite. end_heap_rewrite was not careful to ensure that the target relation is open at the smgr level before performing its final smgrimmedsync. In ordinary cases this is no problem, because it would have been opened earlier during the rewrite. However a crash can be reproduced by re-clustering an empty table with CLOBBER_CACHE_ALWAYS enabled. Although that exact scenario does not crash in v13, I think that's a chance result of unrelated planner changes, and the problem is likely still reachable with other test cases. The true proximate cause of this failure is commit c6b92041d, which replaced a call to heap_sync (which was careful about opening smgr) with a direct call to smgrimmedsync. Hence, back-patch to v13. Amul Sul, per report from Neha Sharma; cosmetic changes and test case by me. Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9d523119fd38fd205cb9c8ea8e7cceeb54355818

  • Fix psql's \connect command some more. Jasen Betts reported yet another unintended side effect of commit 85c54287a: reconnecting with "\c service=whatever" did not have the expected results. The reason is that starting from the output of PQconndefaults() effectively allows environment variables (such as PGPORT) to override entries in the service file, whereas the normal priority is the other way around. Not using PQconndefaults at all would require yet a third main code path in do_connect's parameter setup, so I don't really want to fix it that way. But we can have the logic effectively ignore all the default values for just a couple more lines of code. This patch doesn't change the behavior for "\c -reuse-previous=on service=whatever". That remains significantly different from before 85c54287a, because many more parameters will be re-used, and thus not be possible for service entries to replace. But I think this is (mostly?) intentional. In any case, since libpq does not report where it got parameter values from, it's hard to do differently. Per bug #16936 from Jasen Betts. As with the previous patches, back-patch to all supported branches. (9.5 is unfortunately now out of support, so this won't get fixed there.) Discussion: https://postgr.es/m/16936-3f524322a53a29f0@postgresql.org https://git.postgresql.org/pg/commitdiff/ea80138545043c0cfcff8405b15626796f2695fe

Peter Geoghegan pushed:

  • Recycle nbtree pages deleted during same VACUUM. Maintain a simple array of metadata about pages that were deleted during nbtree VACUUM's current btvacuumscan() call. Use this metadata at the end of btvacuumscan() to attempt to place newly deleted pages in the FSM without further delay. It might not yet be safe to place any of the pages in the FSM by then (they may not be deemed recyclable), but we have little to lose and plenty to gain by trying. In practice there is a very good chance that this will work out when vacuuming larger indexes, where scanning the index naturally takes quite a while. This commit doesn't change the page recycling invariants; it merely improves the efficiency of page recycling within the confines of the existing design. Recycle safety is a part of nbtree's implementation of what Lanin & Shasha call "the drain technique". The design happens to use transaction IDs (they're stored in deleted pages), but that in itself doesn't align the cutoff for recycle safety to any of the XID-based cutoffs used by VACUUM (e.g., OldestXmin). All that matters is whether or not other backends might be able to observe various inconsistencies in the tree structure (that they cannot just detect and recover from by moving right). Recycle safety is purely a question of maintaining the consistency (or the apparent consistency) of a physical data structure. Note that running a simple serial test case involving a large range DELETE followed by a VACUUM VERBOSE will probably show that any newly deleted nbtree pages are not yet reusable/recyclable. This is expected in the absence of even one concurrent XID assignment. It is an old implementation restriction. In practice it's unlikely to be the thing that makes recycling remain unsafe, at least with larger indexes, where recycling newly deleted pages during the same VACUUM actually matters. An important high-level goal of this commit (as well as related recent commits e5d8a999 and 9f3665fb) is to make expensive deferred cleanup operations in index AMs rare in general. If index vacuuming frequently depends on the next VACUUM operation finishing off work that the current operation started, then the general behavior of index vacuuming is hard to predict. This is relevant to ongoing work that adds a vacuumlazy.c mechanism to skip index vacuuming in certain cases. Anything that makes the real world behavior of index vacuuming simpler and more linear will also make top-down modeling in vacuumlazy.c more robust. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-Wzk76_P=67iUscb1UN44-gyZL-KgpsXbSxq_bdcMa7Q+wQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9dd963ae2534e9614f0abeccaafbd39f1b93ff8a

  • nbtree VACUUM: Cope with buggy opclasses. Teach nbtree VACUUM to press on with vacuuming in the event of a page deletion attempt that fails to "re-find" a downlink for its child/target page. There is no good reason to treat this as an irrecoverable error. But there is a good reason not to: pressing on at this point removes any question of VACUUM not making progress solely due to misbehavior from user-defined operator class code. Discussion: https://postgr.es/m/CAH2-Wzma5G9CTtMjbrXTwOym+U=aWg-R7=-htySuztgoJLvZXg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5b861baa550a369e04bf67fbe83f3a5a8c742fb4

Michaël Paquier pushed:

  • Fix timeline assignment in checkpoints with 2PC transactions. Any transactions found as still prepared by a checkpoint have their state data read from the WAL records generated by PREPARE TRANSACTION before being moved into their new location within pg_twophase/. While reading such records, the WAL reader uses the callback read_local_xlog_page() to read a page, that is shared across various parts of the system. This callback, since 1148e22a, has introduced an update of ThisTimeLineID when reading a record while in recovery, which is potentially helpful in the context of cascading WAL senders. This update of ThisTimeLineID interacts badly with the checkpointer if a promotion happens while some 2PC data is read from its record, as, by changing ThisTimeLineID, any follow-up WAL records would be written to an timeline older than the promoted one. This results in consistency issues. For instance, a subsequent server restart would cause a failure in finding a valid checkpoint record, resulting in a PANIC, for instance. This commit changes the code reading the 2PC data to reset the timeline once the 2PC record has been read, to prevent messing up with the static state of the checkpointer. It would be tempting to do the same thing directly in read_local_xlog_page(). However, based on the discussion that has led to 1148e22a, users may rely on the updates of ThisTimeLineID when a WAL record page is read in recovery, so changing this callback could break some cases that are working currently. A TAP test reproducing the issue is added, relying on a PITR to precisely trigger a promotion with a prepared transaction still tracked. Per discussion with Heikki Linnakangas, Kyotaro Horiguchi, Fujii Masao and myself. Author: Soumyadeep Chakraborty, Jimmy Yih, Kevin Yeap Discussion: https://postgr.es/m/CAE-ML+_EjH_fzfq1F3RJ1=XaaNG=-Jz-i3JqkNhXiLAsM3z-Ew@mail.gmail.com Backpatch-through: 10 https://git.postgresql.org/pg/commitdiff/595b9cba2ab0cdd057e02d3c23f34a8bcfd90a2d

  • Simplify TAP tests of kerberos with expected log file contents. The TAP tests of kerberos rely on the logs generated by the backend to check various connection scenarios. In order to make sure that a given test does not overlap with the log contents generated by a previous test, the test suite relied on a logic with the logging collector and a rotation of the log files to ensure the uniqueness of the log generated with a wait phase. Parsing the log contents for expected patterns is a problem that has been solved in a simpler way by PostgresNode::issues_sql_like() where the log file is truncated before checking for the contents generated, with the backend sending its output to a log file given by pg_ctl instead. This commit switches the kerberos test suite to use such a method, removing any wait phase and simplifying the whole logic, resulting in less code. If a failure happens in the tests, the contents of the logs are still showed to the user at the moment of the failure thanks to like(), so this has no impact on debugging capabilities. I have bumped into this issue while reviewing a different patch set aiming at extending the kerberos test suite to check for multiple log patterns instead of one now. Author: Michael Paquier Reviewed-by: Stephen Frost, Bharath Rupireddy Discussion: https://postgr.es/m/YFXcq2vBTDGQVBNC@paquier.xyz https://git.postgresql.org/pg/commitdiff/11e1577a576fec6307aa0bfcde7333e63f907fa7

  • Fix new TAP test for 2PC transactions and PITRs on Windows. The test added by 595b9cb forgot that on Windows it is necessary to set up pg_hba.conf (see PostgresNode::set_replication_conf) with a specific entry or base backups fail. Any node that requires to support replication just needs to pass down allows_streaming at initialization. This updates the test to do so. Simplify things a bit while on it. Per buildfarm member fairywren. Any Windows hosts running this test would have failed, and I have reproduced the problem as well. Backpatch-through: 10 https://git.postgresql.org/pg/commitdiff/992d353a190c551db39bcab2dec0ecf14fbc7a40

  • Fix concurrency issues with WAL segment recycling on Windows. This commit is mostly a revert of aaa3aed, that switched the routine doing the internal renaming of recycled WAL segments to use on Windows a combination of CreateHardLinkA() plus unlink() instead of rename(). As reported by several users of Postgres 13, this is causing concurrency issues when manipulating WAL segments, mostly in the shape of the following error: LOG: could not rename file "pg_wal/000000XX000000YY000000ZZ": Permission denied This moves back to a logic where a single rename() (well, pgrename() for Windows) is used. This issue has proved to be hard to hit when I tested it, facing it only once with an archive_command that was not able to do its work, so it is environment-sensitive. The reporters of this issue have been able to confirm that the situation improved once we switched back to a single rename(). In order to check things, I have provided to the reporters a patched build based on 13.2 with aaa3aed reverted, to test if the error goes away, and an unpatched build of 13.2 to test if the error still showed up (just to make sure that I did not mess up my build process). Extra thanks to Fujii Masao for pointing out what looked like the culprit commit, and to all the reporters for taking the time to test what I have sent them. Reported-by: Andrus, Guy Burgess, Yaroslav Pashinsky, Thomas Trenz Reviewed-by: Tom Lane, Andres Freund Discussion: https://postgr.es/m/3861ff1e-0923-7838-e826-094cc9bef737@hot.ee Discussion: https://postgr.es/m/16874-c3eecd319e36a2bf@postgresql.org Discussion: https://postgr.es/m/095ccf8d-7f58-d928-427c-b17ace23cae6@burgess.co.nz Discussion: https://postgr.es/m/16927-67c570d968c99567%40postgresql.org Discussion: https://postgr.es/m/YFBcRbnBiPdGZvfW@paquier.xyz Backpatch-through: 13 https://git.postgresql.org/pg/commitdiff/909b449e00fc2f71e1a38569bbddbb6457d28485

  • Add per-index stats information in verbose logs of autovacuum. Once a relation's autovacuum is completed, the logs include more information about this relation state if the threshold of log_autovacuum_min_duration (or its relation option) is reached, with for example contents about the statistics of the VACUUM operation for the relation, WAL and system usage. This commit adds more information about the statistics of the relation's indexes, with one line of logs generated for each index. The index stats were already calculated, but not printed in the context of autovacuum yet. While on it, some refactoring is done to keep track of the index statistics directly within LVRelStats, simplifying some routines related to parallel VACUUMs. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Euler Taveira Discussion: https://postgr.es/m/CAD21AoAy6SxHiTivh5yAPJSUE4S=QRPpSZUdafOSz0R+fRcM6Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5aed6a1fc214913de9ac69c1717dc64a2483e16d

  • Reword slightly logs generated for index stats in autovacuum. Using "remain" is confusing, as it implies that the index file can shrink. Instead, use "in total". Per discussion with Peter Geoghegan. Discussion: https://postgr.es/m/CAH2-WzkYgHZzpGOwR14CScJsjaQpvJrEkEfkh_=wGhzLb=yVdQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/99dd75fb99baa9188971cf47779ed8d7a5e6eb29

  • Sanitize the term "combo CID" in code comments. Combo CIDs were referred in the code comments using different terms across various places of the code, so unify a bit the term used with what is currently in use in some of the READMEs. Author: "Hou, Zhijie" Discussion: https://postgr.es/m/1d42865c91404f46af4562532fdbea31@G08CNEXMBPEKD05.g08.fujitsu.local https://git.postgresql.org/pg/commitdiff/a1999a01bb56c5f5451116abe61b892b2eec5e49

Noah Misch pushed:

Fujii Masao pushed:

  • pgbench: Improve error-handling in \sleep command. This commit improves pgbench \sleep command so that it handles the following three cases more properly. (1) When only one argument was specified in \sleep command and it's not a number, previously pgbench reported a confusing error message like "unrecognized time unit, must be us, ms or s". This commit fixes this so that more proper error message like "invalid sleep time, must be an integer" is reported. (2) When two arguments were specified in \sleep command and the first argument was not a number, previously pgbench treated that argument as the sleep time 0. No error was reported in this case. This commit fixes this so that an error is thrown in this case. (3) When a variable was specified as the first argument in \sleep command and the variable stored non-digit value, previously pgbench treated that argument as the sleep time 0. No error was reported in this case. This commit fixes this so that an error is thrown in this case. Author: Kota Miyake Reviewed-by: Hayato Kuroda, Alvaro Herrera, Fujii Masao Discussion: https://postgr.es/m/23b254daf20cec4332a2d9168505dbc9@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/8c6eda2d1c926be76baa79c28521275323bd26fd

  • pg_waldump: Fix bug in per-record statistics. pg_waldump --stats=record identifies a record by a combination of the RmgrId and the four bits of the xl_info field of the record. But XACT records use the first bit of those four bits for an optional flag variable, and the following three bits for the opcode to identify a record. So previously the same type of XACT record could have different four bits (three bits are the same but the first one bit is different), and which could cause pg_waldump --stats=record to show two lines of per-record statistics for the same XACT record. This is a bug. This commit changes pg_waldump --stats=record so that it processes only XACT record differently, i.e., filters the opcode out of xl_info and uses a combination of the RmgrId and those three bits as the identifier of a record, only for XACT record. For other records, the four bits of the xl_info field are still used. Back-patch to all supported branches. Author: Kyotaro Horiguchi Reviewed-by: Shinya Kato, Fujii Masao Discussion: https://postgr.es/m/2020100913412132258847@highgo.ca https://git.postgresql.org/pg/commitdiff/51893c8463501fc9a38e39cc097773dbdfb9db82

  • Change the type of WalReceiverWaitStart wait event from Client to IPC. Previously the type of this wait event was Client. But while this wait event is being reported, walreceiver process is waiting for the startup process to set initial data for streaming replication. It's not waiting for any activity on a socket connected to a user application or walsender. So this commit changes the type for WalReceiverWaitStart wait event to IPC. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/cdacc27c-37ff-f1a4-20e2-ce19933abfcc@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/1e3e8b51bda8ddd59984230f876f199c9ce3166a

  • Log when GetNewOidWithIndex() fails to find unused OID many times. GetNewOidWithIndex() generates a new OID one by one until it finds one not in the relation. If there are very long runs of consecutive existing OIDs, GetNewOidWithIndex() needs to iterate many times in the loop to find unused OID. Since TOAST table can have a large number of entries and there can be such long runs of OIDs, there is the case where it takes so many iterations to find new OID not in TOAST table. Furthermore if all (i.e., 2^32) OIDs are already used, GetNewOidWithIndex() enters something like busy loop and repeats the iterations until at least one OID is marked as unused. There are some reported troubles caused by a large number of iterations in GetNewOidWithIndex(). For example, when inserting a billion of records into the table, all the backends doing that insertion operation got hang with 100% CPU usage at some point. Previously there was no easy way to detect that GetNewOidWithIndex() failed to find unused OID many times. So, for example, gdb full backtrace of hanged backends needed to be taken, in order to investigate that trouble. This is inconvenient and may not be available in some production environments. To provide easy way for that, this commit makes GetNewOidWithIndex() log that it iterates more than GETNEWOID_LOG_THRESHOLD but have not yet found OID unused in the relation. Also this commit makes it repeat logging with exponentially increasing intervals until it iterates more than GETNEWOID_LOG_MAX_INTERVAL, and makes it finally repeat logging every GETNEWOID_LOG_MAX_INTERVAL unless an unused OID is found. Those macro variables are used not to fill up the server log with the similar messages. In the discusion at pgsql-hackers, there was another idea to report the lots of iterations in GetNewOidWithIndex() via wait event. But since GetNewOidWithIndex() traverses indexes to find unused OID and which will do I/O, acquire locks, etc, which will overwrite the wait event and reset it to nothing once done. So that idea doesn't work well, and we didn't adopt it. Author: Tomohiro Hiramitsu Reviewed-by: Tatsuhito Kasahara, Kyotaro Horiguchi, Tom Lane, Fujii Masao Discussion: https://postgr.es/m/16722-93043fb459a41073@postgresql.org https://git.postgresql.org/pg/commitdiff/7fbcee1b2d5f1012c67942126881bd492e95077e

  • Rename wait event WalrcvExit to WalReceiverExit. Commit de829ddf23 added wait event WalrcvExit. But its name is not consistent with other wait events like WalReceiverMain or WalReceiverWaitStart, etc. So this commit renames WalrcvExit to WalReceiverExit. Author: Fujii Masao Reviewed-by: Thomas Munro Discussion: https://postgr.es/m/cced9995-8fa2-7b22-9d91-3f22a2b8c23c@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/84007043fc1b1be68dad5d0a78269347c12094b6

  • Improve connection denied error message during recovery. Previously when an archive recovery or a standby was starting and reached the consistent recovery state but hot_standby was configured to off, the error message when a client connectted was "the database system is starting up", which was needless confusing and not really all that accurate either. This commit improves the connection denied error message during recovery, as follows, so that the users immediately know that their servers are configured to deny those connections.

  • If hot_standby is disabled, the error message "the database system is not accepting connections" and the detail message "Hot standby mode is disabled." are output when clients connect while an archive recovery or a standby is running. - If hot_standby is enabled, the error message "the database system is not yet accepting connections" and the detail message "Consistent recovery state has not been yet reached." are output when clients connect until the consistent recovery state is reached and postmaster starts accepting read only connections. This commit doesn't change the connection denied error message of "the database system is starting up" during normal server startup and crash recovery. Because it's still suitable for those situations. Author: James Coleman Reviewed-by: Alvaro Herrera, Andres Freund, David Zhang, Tom Lane, Fujii Masao Discussion: https://postgr.es/m/CAAaqYe8h5ES_B=F_zDT+Nj9XU7YEwNhKhHA2RE4CFhAQ93hfig@mail.gmail.com https://git.postgresql.org/pg/commitdiff/df9384492b89aac370ab9d12eb89375aeb38a1d4

  • Fix bug in WAL replay of COMMIT_TS_SETTS record. Previously the WAL replay of COMMIT_TS_SETTS record called TransactionTreeSetCommitTsData() with the argument write_xlog=true, which generated and wrote new COMMIT_TS_SETTS record. This should not be acceptable because it's during recovery. This commit fixes the WAL replay of COMMIT_TS_SETTS record so that it calls TransactionTreeSetCommitTsData() with write_xlog=false and doesn't generate new WAL during recovery. Back-patch to all supported branches. Reported-by: lx zou zoulx1982@163.com Author: Fujii Masao Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/16931-620d0f2fdc6108f1@postgresql.org https://git.postgresql.org/pg/commitdiff/438fc4a39c3905b7af88bb848bc5aeb1308a017d

Robert Haas pushed:

Tomáš Vondra pushed:

Bruce Momjian pushed:

Amit Kapila pushed:

Peter Eisentraut pushed:

Stephen Frost pushed:

Michael Meskes pushed:

Álvaro Herrera pushed:

Pending Patches

Kyotaro HORIGUCHI sent in another revision of a patch to make async replica wait for the lsn to be replayed.

Bharath Rupireddy sent in a patch to remove extra memset calls in BloomInitPage, GinInitPage, and SpGistInitPage.

Hou Zhijie sent in another revision of a patch to avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table.

Amul Sul sent in another revision of a patch to build the infrastructure that will support ALTER SYSTEM READ {ONLY | WRITE}.

Amit Langote sent in another revision of a patch to fix an infelicity between UPDATE ... RETURNING and moving tuples between partitions.

Greg Nancarrow sent in another revision of a patch to enable parallel INSERT and or SELECT for INSERT INTO.

Tang sent in another revision of a patch to support tab completion with a query result for upper case character inputs in psql.

Tom Lane sent in another revision of a patch to allow an alias to be attached directly to a JOIN ... USING per the SQL standard.

David Oksman sent in a patch to implement ALTER TABLE ... RENAME COLUMN IF EXISTS.

Andrei Zubkov sent in two revisions of a patch to add statement entry timestamps to pg_stat_statements.

Thomas Munro sent in another revision of a patch to add PSQL_WATCH_PAGER for psql's \watch command.

Thomas Munro sent in four more revisions of a patch to detect dropped connections while running queries.

Fujii Masao sent in a patch intended to fix a bug that manifested as Failed assertion on standby while shutdown by making the startup process call ShutdownRecoveryTransactionEnvironment() when it exits.

Peter Eisentraut sent in another revision of a patch to add a result_format_auto_binary_types setting.

Jan Wieck sent in three more revisions of a patch to fix pg_upgrade to preserve datdba.

Bertrand Drouvot sent in four more revisions of a patch to implement minimal logical decoding on standbys.

Pavel Stěhule sent in two more revisions of a patch to implement schema variables.

Marcus Wanner sent in two more revisions of a patch to add an xid argument to the filter_prepare callback for output plugins.

Amul Sul sent in two more revisions of a patch to add an RelationGetSmgr() inline function.

Peter Smith and Amit Kapila traded patches to add logical decoding of two-phase transactions.

Euler Taveira de Oliveira and Peter Eisentraut traded patches to add row filtering for logical replication.

Kyotaro HORIGUCHI sent in another revision of a patch to change the stats collector's temporary storage from files to shared memory.

Masahiro Ikeda and Fujii Masao traded patches to make the WAL receiver report WAL statistics.

Bruce Momjian and Julien Rouhaud traded patches to expose queryid in pg_stat_activity, log_line_prefix, and verbose explain.

Atsushi Torikoshi sent in four more revisions of a patch to add a function, pg_get_backend_memory_contexts(), which does what it says on the label.

Daniel Gustafsson sent in two more revisions of a patch to support NSS as a libpq TLS backend.

Michaël Paquier and Jeevan Chalke traded patches to log authenticated identity from all auth backends.

Stephen Frost sent in another revision of a patch to use a WaitLatch for vacuum/autovacuum sleeping.

Stephen Frost sent in three more revisions of a patch to add a documentation stub for the now obsolete recovery.conf.

Justin Pryzby sent in another revision of a patch to add an optional ACCESS METHOD to CREATE TABLE ... LIKE.

Takayuki Tsunakawa sent in two more revisions of a patch to speed up COPY FROM for the case of remote partitions.

Amit Langote sent in another revision of a patch to create foreign key triggers in partitioned tables, and use this to enforce foreign key correctly during cross-partition updates.

David Rowley sent sent in two more revisions of a patch to add a Result Cache executor node.

Li Japin sent in another revision of a patch to implement ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION.

Tomáš Vondra sent in a patch to fix up an opclass storage type.

Fujii Masao sent in another revision of a patch to rename WalrcvExit wait_event to WalReceiverExit.

Andrey V. Lepikhov sent in another revision of a patch to implement global snapshots.

Atsushi Torikoshi sent in two more revisions of a patch to add plan type to pg_stat_statements.

Denis Hirn sent in a patch to allow multiple recursive self-references in WITH RECURSIVE.

Masahiro Ikeda and Fujii Masao traded patches to get pgstat to avoid writing on SIGQUIT.

Kyotaro HORIGUCHI sent in another revision of a patch to protect syscache from bloating with negative cache entries.

Hou Zhijie sent in another revision of a patch to add a nullif case for eval_const_expressions.

Mark Dilger and Robert Haas traded patches to add a pg_amcheck contrib application.

Daniel Gustafsson sent in another revision of a patch to refactor the SSL test harness to allow for multiple libraries.

Pavel Stěhule sent in two more revisions of a patch to add routine labels.

Thomas Munro sent in four more revisions of a patch to make all SLRU buffer sizes configurable.

Peter Geoghegan and Masahiko Sawada traded patches to centralize state for each VACUUM, break lazy_scan_heap() up into functions, remove the tupgone special case from vacuumlazy.c, and skip index vacuuming in some cases.

Kyotaro HORIGUCHI sent in another revision of a patch to implement in-place table persistence change and add a new command, ALTER TABLE ALL IN TABLESPACE SET LOGGED/UNLOGGED, to go with it.

Ashutosh Bapat sent in another revision of a patch to fix a memory leak in decoding speculative inserts with TOAST.

Ekaterina Sokolova sent in another revision of a patch to add extra statistics to explain for Nested Loop.

Pavel Borisov sent in two more revisions of a patch to implement covering SP-GiST indexes, i.e. support for INCLUDE columns.

Marcus Wanner add a concurrent_abort callback for the output plugin.

Joel Jacobson sent in another revision of a patch to add views pg_permissions and pg_ownerships.

Bharath Rupireddy sent in another revision of a patch to make the error messages while adding tables to publications a bit more informative and consistent.

Kyotaro HORIGUCHI sent in another revision of a intended to fix a bug that manifested as Walsender may fail to send wal to the end.

Jim Finnerty sent in another revision of a patch to add a capability to have 64-bit GUCs, use XID_FMT to format xids, and use ClogPageNumber in place of int for type safety.

Sven Klemm sent in a patch to allow CustomScan nodes to signal whether they support projection.

Andrew Dunstan and Nikita Glukhov traded patches to implement the JSON_TABLE part of SQL/JSON.

Andrew Dunstan and Nikita Glukhov traded patches to implement the functions part of SQL/JSON.

Amit Langote and Tom Lane traded patches to make updates in inheritance trees scale better by overhauling how updates compute new tuples, and revise how inherited update/delete are handled.

David Steele sent in two revisions of a patch to document the fact that backup labels may need to be opened in binary mode on Windows.

Cai Mengjuan sent in a patch to update walrcv->flushedUpto each time when requesting xlog streaming.

Andrew Dunstan sent in another revision of a patch to allow matching the whole DN from a client certificate.

Masahiro Ikeda sent in a patch to improve the performance of reporting wal stats.

Tomáš Vondra sent in a patch to show applied extended statistics in explain.

Noah Misch sent in another revision of a patch to add a public schema default ACL.

Lætitia Avrot sent in two revisions of a patch to make it possible to dump only functions using pg_dump.

Noah Misch sent in another revision of a patch to accept slightly-filled pages for tuples larger than fillfactor.

Álvaro Herrera sent in two more revisions of a patch to add tracing capability to libpq.

Kazutaka Onishi sent in another revision of a patch to make TRUNCATE on foreign tables work.

Andrew Dunstan sent in another revision of a patch to implement global temporary tables.

Yoan SULTAN sent in a patch to make it possible for pg_stat_statements to track the most recent statement.

David Rowley sent in another revision of a patch to get better results from valgrind leak tracking.

Kategóriák: Informatika

Translations of Code of Conduct Posted: Japanese, Hebrew, and Russian

2021, március 26 - 01:00

The PostgreSQL Code of Conduct Committee is pleased to announce that the Japanese translation of the PostgreSQL Community Code of Conduct has been updated, and is now current. The Committee thanks Tatsuo Ishii, Tetsuo Sakata, and Akira Kurosawa for their work in updating the translation.

We now have a Hebrew translation as well, with thanks to Michael Goldberg and Emil Shkolnik.

There is also a Russian translation, drafted by Anastasia Raspopina, Anastasia Lubennikova, and Alexander Lakhin, with additional edits offered by Viktor Yegorov and Valeria Kaplan. We appreciate their teamwork in creating this new translation.

The English policy is here:

https://www.postgresql.org/about/policies/coc/

The updated Japanese translation is here:

https://www.postgresql.org/about/policies/coc/ja/

The Hebrew translation is here:

https://www.postgresql.org/about/policies/coc/he/

The Russian translation is here:

https://www.postgresql.org/about/policies/coc/ru/

Kategóriák: Informatika

pgAdmin 4 v5.1 Released

2021, március 25 - 01:00

The pgAdmin Development Team are pleased to announce pgAdmin 4 version 5.1. This release of pgAdmin 4 includes 42 bug fixes and new features. For more details please see the release notes.

pgAdmin is the leading Open Source graphical management tool for PostgreSQL. For more information, please see the website.

Notable changes in this release include:

Features:
  • Zoom scaling options with keyboard shortcuts in runtime:

    This feature includes the zoom scaling (Zoom In, Zoom Out, Actual Size) options along with Enter/Exit Full-Screen mode. We have added the options in Menu as well as using keyboard shortcuts users can perform the desired operation.

  • Show the login roles that are members of a group role be shown when examining a group role.

  • Added '--replace' option in Import server to replace the list of servers with the newly imported one.
  • Make the container distribution a multi-arch build with x86_64 and Arm64 support.
  • Make 'Kerberos an optional feature in the Python wheel, to avoid the need to install MIT Kerberos on the system by default.
Bugs/Housekeeping:
  • Fixed an issue where hanging symlinks in a directory caused the select file dialog to break.
  • Make the 'Save Data Changes' icon to be more intuitive.
  • Ensure that cell content being auto-selected when editing the cell data.
  • Ensure that SQL formatter should not add extra tabs and format the SQL correctly.
  • Ensure that SQL formatter should not use tab size if 'Use spaces?' set to false.
  • Fixed encoding issue when database encoding set to SQL_ASCII and name of the column is in ASCII character.
  • Ensure that the user should be able to kill the session from Dashboard if the user has a 'pg_signal_backend' role.
  • Ensure PGADMIN_DEFAULT_EMAIL looks sane when initializing a container deployment.
  • Fixed an issue where the user is not able to change the connection in Query Tool when any SQL file is opened.
  • Fixed an issue where copy/paste rows in view data paste the wrong value for boolean type.
  • Ensure that toggle buttons are accessible by most screen readers.

Builds for Windows and macOS are available now, along with a Python Wheel, Docker Container, RPM, DEB Package, and source code tarball from the tarball area.

Kategóriák: Informatika

PostgreSQL Weekly News - March 21, 2021

2021, március 22 - 01:00
PostgreSQL Weekly News - March 21, 2021

Person of the week: https://postgresql.life/post/anastasia_lubennikova/

PostgreSQL Product News

pspg 4.4.0 a pager designed for PostgreSQL, released. https://github.com/okbob/pspg/releases/tag/4.4.0

pg_activity 2.1.2, a top-like application for PostgreSQL server activity monitoring, released. https://github.com/dalibo/pg_activity/releases/tag/v2.1.2

PostgreSQL Jobs for March

https://archives.postgresql.org/pgsql-jobs/2021-03/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Peter Geoghegan pushed:

Fujii Masao pushed:

  • Make archiver process an auxiliary process. This commit changes the WAL archiver process so that it's treated as an auxiliary process and can use shared memory. This is an infrastructure patch required for the upcoming shared-memory based stats collector patch series. Those patches basically need every process that reports statistics, including the archiver, to have access to shared memory. Since this patch is useful on its own for simplifying the code and for letting users monitor the status of the archiver, it's committed separately in advance. This commit simplifies the code for WAL archiving. For example, previously backends needed to signal the archiver via the postmaster when notifying it that there are WAL files to archive. This commit removes that signal to the postmaster and enables backends to notify the archiver directly using a shared latch. Also, as a side effect of this change, information about the archiver process becomes viewable in the pg_stat_activity view. Author: Kyotaro Horiguchi Reviewed-by: Andres Freund, Álvaro Herrera, Julien Rouhaud, Tomas Vondra, Arthur Zakirov, Fujii Masao Discussion: https://postgr.es/m/20180629.173418.190173462.horiguchi.kyotaro@lab.ntt.co.jp https://git.postgresql.org/pg/commitdiff/d75288fb27b8fe0a926aaab7d75816f091ecdc27
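
    Since the archiver now shows up as a regular auxiliary process, one way to observe the change (on a server with this commit, i.e. PostgreSQL 14 or later) is:

        SELECT pid, backend_type
        FROM pg_stat_activity
        WHERE backend_type = 'archiver';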

  • Fix comments in postmaster.c. Commit 86c23a6eb2 changed the option to specify that postgres will stop all other server processes by sending the signal SIGSTOP, from -s to -T. But previously there were comments incorrectly explaining that SIGSTOP behavior is set by -s option. This commit fixes them. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210316.165141.1400441966284654043.horikyota.ntt@gmail.com https://git.postgresql.org/pg/commitdiff/fd31214075cc740e43edc71ca1c385c8c53047b7

Thomas Munro pushed:

Tom Lane pushed:

  • Work around issues in MinGW-64's setjmp/longjmp support. It's hard to avoid the conclusion that there is something wrong with setjmp/longjmp on MinGW-64, as we have seen failures come and go after entirely-unrelated-looking changes in our own code. Other projects such as Ruby have given up and started using gcc's setjmp/longjmp builtins on that platform; this patch just follows that lead. Note that this is a pretty fundamental ABI break for functions containing either setjmp or longjmp, so we can't really consider a back-patch. Per reports from Regina Obe and Heath Lord, as well as recent failures on buildfarm member walleye, and less-recent failures on fairywren. Juan José Santamaría Flecha Discussion: https://postgr.es/m/000401d716a0$1ed0fc70$5c72f550$@pcorp.us Discussion: https://postgr.es/m/CA+BEBhvHhM-Bn628pf-LsjqRh3Ang7qCSBG0Ga+7KwhGqrNUPw@mail.gmail.com Discussion: https://postgr.es/m/f1caef93-9640-022e-9211-bbe8755a56b0@2ndQuadrant.com https://git.postgresql.org/pg/commitdiff/146cb3889c3ccb3fce198fe7464a1296a9e107c3

  • Improve logging of bad parameter values in BIND messages. Since commit ba79cb5dc, values of bind parameters have been logged during errors in extended query mode. However, we only did that after we'd collected and converted all the parameter values, thus failing to offer any useful localization of invalid-parameter problems. Add a separate callback that's used during parameter collection, and have it print the parameter number, along with the input string if text input format is used. Justin Pryzby and Tom Lane Discussion: https://postgr.es/m/20210104170939.GH9712@telsasoft.com Discussion: https://postgr.es/m/CANfkH5k-6nNt-4cSv1vPB80nq2BZCzhFVR5O4VznYbsX0wZmow@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1ea396362be1615e926ea69d666c770081a0d3ef

  • Avoid corner-case memory leak in SSL parameter processing. After reading the root cert list from the ssl_ca_file, immediately install it as client CA list of the new SSL context. That gives the SSL context ownership of the list, so that SSL_CTX_free will free it. This avoids a permanent memory leak if we fail further down in be_tls_init(), which could happen if bogus CRL data is offered. The leak could only amount to something if the CRL parameters get broken after server start (else we'd just quit) and then the server is SIGHUP'd many times without fixing the CRL data. That's rather unlikely perhaps, but it seems worth fixing, if only because the code is clearer this way. While we're here, add some comments about the memory management aspects of this logic. Noted by Jelte Fennema and independently by Andres Freund. Back-patch to v10; before commit de41869b6 it doesn't matter, since we'd not re-execute this code during SIGHUP. Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org https://git.postgresql.org/pg/commitdiff/4b12ab18c9d0735d760bf7667b158707b06e5df8

  • Doc: improve discussion of variable substitution in PL/pgSQL. This was a bit disjointed, partly because of a not-well-considered decision to document SQL commands that don't return result rows as though they had nothing in common with commands that do. Rearrange so that we have one discussion of variable substitution that clearly applies to all types of SQL commands, and then handle the question of processing command output separately. Clarify that EXPLAIN, CREATE TABLE AS SELECT, and similar commands that incorporate an optimizable statement will act like optimizable statements for the purposes of variable substitution. Do a bunch of minor wordsmithing in the same area. David Johnston and Tom Lane, reviewed by Pavel Stehule and David Steele Discussion: https://postgr.es/m/CAKFQuwYvMKucM5fnZvHSo-ah4S=_n9gmKeu6EAo=_fTrohunqQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/c783e656d41816b0328cb4bff27f11b70200770e

  • Add end-to-end testing of pg_basebackup's tar-format output. The existing test script does run pg_basebackup with the -Ft option, but it makes no real attempt to verify the sanity of the results. We wouldn't know if the output is incompatible with standard "tar" programs, nor if the server fails to start from the restored output. Notably, this means that xlog.c's read_tablespace_map() is not being meaningfully tested, since that code is used only in the tar-format case. (We do have reasonable coverage of restoring from plain-format output, though it's over in src/test/recovery not here.) Hence, attempt to untar the output and start a server from it, rather than just hoping it's OK. This test assumes that the local "tar" has the "-C directory" switch. Although that's not promised by POSIX, my research suggests that all non-extinct tar implementations have it. Should the buildfarm's opinion differ, we can complicate the test a bit to avoid requiring that. Possibly this should be back-patched, but I'm unsure about whether it could work on Windows before d66b23b03. https://git.postgresql.org/pg/commitdiff/081876d75ea15c3bd2ee5ba64a794fd8ea46d794

  • Prevent buffer overrun in read_tablespace_map(). Robert Foggia of Trustwave reported that read_tablespace_map() fails to prevent an overrun of its on-stack input buffer. Since the tablespace map file is presumed trustworthy, this does not seem like an interesting security vulnerability, but still we should fix it just in the name of robustness. While here, document that pg_basebackup's --tablespace-mapping option doesn't work with tar-format output, because it doesn't. To make it work, we'd have to modify the tablespace_map file within the tarball sent by the server, which might be possible but I'm not volunteering. (Less-painful solutions would require changing the basebackup protocol so that the source server could adjust the map. That's not very appetizing either.) https://git.postgresql.org/pg/commitdiff/a50e4fd028a1ece2b1a04df2c9ae6581783e9eef

  • Code review for server's handling of "tablespace map" files. While looking at Robert Foggia's report, I noticed a passel of other issues in the same area:

    • The scheme for backslash-quoting newlines in pathnames is just wrong; it will misbehave if the last ordinary character in a pathname is a backslash. I'm not sure why we're bothering to allow newlines in tablespace paths, but if we're going to do it we should do it without introducing other problems. Hence, backslashes themselves have to be backslashed too.
    • The author hadn't read the sscanf man page very carefully, because this code would drop any leading whitespace from the path. (I doubt that a tablespace path with leading whitespace could happen in practice; but if we're bothering to allow newlines in the path, it sure seems like leading whitespace is little less implausible.) Using sscanf for the task of finding the first space is overkill anyway.
    • While I'm not 100% sure what the rationale for escaping both \r and \n is, if the idea is to allow Windows newlines in the file then this code failed, because it'd throw an error if it saw \r followed by \n.
    • There's no cross-check for an incomplete final line in the map file, which would be a likely apparent symptom of the improper-escaping bug.

    On the generation end, aside from the escaping issue we have:

    • If needtblspcmapfile is true then do_pg_start_backup will pass back escaped strings in tablespaceinfo->path values, which no caller wants or is prepared to deal with. I'm not sure if there's a live bug from that, but it looks like there might be (given the dubious assumption that anyone actually has newlines in their tablespace paths).
    • It's not being very paranoid about the possibility of random stuff in the pg_tblspc directory. IMO we should ignore anything without an OID-like name.

    The escaping rule change doesn't seem back-patchable: it'll require doubling of backslashes in the tablespace_map file, which is basically a basebackup format change. The odds of that causing trouble are considerably more than the odds of the existing bug causing trouble. The rest of this seems somewhat unlikely to cause problems too, so no back-patch. https://git.postgresql.org/pg/commitdiff/8620a7f6dbdf978e57cdb9f42802a0418656d863

  • Doc: remove duplicated step in RLS example. Seems to have been a copy-and-paste mistake in 093129c9d. Per report from max1@inbox.ru. Discussion: https://postgr.es/m/161591740692.24273.4202054598867879464@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/70945649d734d16be22c3d1d90dd8c3d3c1e9d89

  • Fix misuse of foreach_delete_current(). Our coding convention requires this macro's result to be assigned back to the original List variable. In this usage, since the List could not become empty, there was no actual bug --- but some compilers warned about it. Oversight in be45be9c3. Discussion: https://postgr.es/m/35077b31-2d62-1e31-0e2e-ddb52d590b73@enterprisedb.com https://git.postgresql.org/pg/commitdiff/1d581ce7129d7a33cd4ad27f8f246abfa1fd2db9

  • Don't leak rd_statlist when a relcache entry is dropped. Although these lists are usually NIL, and even when not empty are unlikely to be large, constant relcache update traffic could eventually result in visible bloat of CacheMemoryContext. Found via valgrind testing. Back-patch to v10 where this field was added. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/28644fac10731e30e70b622986a6fbbeb5a5b2f9

  • Don't leak malloc'd error string in libpqrcv_check_conninfo(). We leaked the error report from PQconninfoParse, when there was one. It seems unlikely that real usage patterns would repeat the failure often enough to create serious bloat, but let's back-patch anyway to keep the code similar in all branches. Found via valgrind testing. Back-patch to v10 where this code was added. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/9bacdf9f536a3720976ae258238cb46c691ce9b2

  • Don't leak compiled regex(es) when an ispell cache entry is dropped. The text search cache mechanisms assume that we can clean up an invalidated dictionary cache entry simply by resetting the associated long-lived memory context. However, that does not work for ispell affixes that make use of regular expressions, because the regex library deals in plain old malloc. Hence, we leaked compiled regex(es) any time we dropped such a cache entry. That could quickly add up, since even a fairly trivial regex can use up tens of kB, and a large one can eat megabytes. Add a memory context callback to ensure that a regex gets freed when its owning cache entry is cleared. Found via valgrind testing. This problem is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/d303849b059c3c315e5a8d4239016f8328f3296c

  • Don't run RelationInitTableAccessMethod in a long-lived context. Some code paths in this function perform syscache lookups, which can lead to table accesses and possibly leakage of cruft into the caller's context. If said context is CacheMemoryContext, we eventually will have visible bloat. But fixing this is no harder than moving one memory context switch step. (The other callers don't have a problem.) Andres Freund and I independently found this via valgrind testing. Back-patch to v12 where this code was added. Discussion: https://postgr.es/m/20210317023101.anvejcfotwka6gaa@alap3.anarazel.de Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/415ffdc2205e209b6a73fb42a3fdd6e57e16c7b2

  • Don't leak malloc'd strings when a GUC setting is rejected. Because guc.c prefers to keep all its string values in malloc'd not palloc'd storage, it has to be more careful than usual to avoid leaks. Error exits out of string GUC hook checks failed to clear the proposed value string, and error exits out of ProcessGUCArray() failed to clear the malloc'd results of ParseLongOption(). Found via valgrind testing. This problem is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/377b7a83007d277d32ef19f7c7590c8668d504cb

  • Blindly try to fix test script's tar invocation for MSYS. Buildfarm member fairywren doesn't like the test case I added in commit 081876d75. I'm guessing the reason is that I shouldn't be using a perl2host-ified path in the tar command line. https://git.postgresql.org/pg/commitdiff/27ab1981e7c9b8fcbcb143c5f6f706441a52bbc8

  • Avoid leaking memory in RestoreGUCState(), and improve comments. RestoreGUCState applied InitializeOneGUCOption to already-live GUC entries, causing any malloc'd subsidiary data to be forgotten. We do want the effect of resetting the GUC to its compiled-in default, and InitializeOneGUCOption seems like the best way to do that, so add code to free any existing subsidiary data beforehand. The interaction between can_skip_gucvar, SerializeGUCState, and RestoreGUCState is way more subtle than their opaque comments would suggest to an unwary reader. Rewrite and enlarge the comments to try to make it clearer what's happening. Remove a long-obsolete assertion in read_nondefault_variables: the behavior of set_config_option hasn't depended on IsInitProcessingMode since f5d9698a8 installed a better way of controlling it. Although this is fixing a clear memory leak, the leak is quite unlikely to involve any large amount of data, and it can only happen once in the lifetime of a worker process. So it seems unnecessary to take any risk of back-patching. Discussion: https://postgr.es/m/4105247.1616174862@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/f0c2a5bba6c566fad781802537eb17f2977702bc

  • Fix memory leak when rejecting bogus DH parameters. While back-patching e0e569e1d, I noted that there were some other places where we ought to be applying DH_free(); namely, where we load some DH parameters from a file and then reject them as not being sufficiently secure. While it seems really unlikely that anybody would hit these code paths in production, let alone do so repeatedly, let's fix it for consistency. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org https://git.postgresql.org/pg/commitdiff/e835e89a0fd267871e7fbddc39ad00ee3d0cb55c

  • Fix up pg_dump's handling of per-attribute compression options. The approach used in commit bbe0a81db would've been disastrous for portability of dumps. Instead handle non-default compression options in separate ALTER TABLE commands. This reduces chatter for the common case where most columns are compressed the same way, and it makes it possible to restore the dump to a server that lacks any knowledge of per-attribute compression options (so long as you're willing to ignore syntax errors from the ALTER TABLE commands). There's a whole lot left to do to mop up after bbe0a81db, but I'm fast-tracking this part because we need to see if it's enough to make the buildfarm's cross-version-upgrade tests happy. Justin Pryzby and Tom Lane Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com https://git.postgresql.org/pg/commitdiff/aa25d1089ac00bbc3f97d2efe8f54c3d4beed5d1
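
    The per-attribute commands pg_dump now emits are along these lines (the table and column names here are hypothetical):

        ALTER TABLE blobs ALTER COLUMN payload SET COMPRESSION lz4;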

  • Suppress various new compiler warnings. Compilers that don't understand that elog(ERROR) doesn't return issued warnings here. In the cases in libpq_pipeline.c, we were not exactly helping things by failing to mark pg_fatal() as noreturn. Per buildfarm. https://git.postgresql.org/pg/commitdiff/9fb9691a88ae8df9bc30e0f0f72de7c96e73e125

Álvaro Herrera pushed:

Michaël Paquier pushed:

Stephen Frost pushed:

Robert Haas pushed:

  • Fix a confusing amcheck corruption message. Don't complain about the last TOAST chunk number being different from what we expected if there are no TOAST chunks at all. In such a case, saying that the final chunk number is 0 is not really accurate, and the fact the value is missing from the TOAST table is reported separately anyway. Mark Dilger Discussion: http://postgr.es/m/AA5506CE-7D2A-42E4-A51D-358635E3722D@enterprisedb.com https://git.postgresql.org/pg/commitdiff/4078ce65a0f7197180a9be2c6460ea4bf909bd98

  • Allow configurable LZ4 TOAST compression. There is now a per-column COMPRESSION option which can be set to pglz (the default, and the only option up until now) or lz4. Or, if you like, you can set the new default_toast_compression GUC to lz4, and then that will be the default for new table columns for which no value is specified. We don't have lz4 support in the PostgreSQL code, so to use lz4 compression, PostgreSQL must be built --with-lz4. In general, TOAST compression means compression of individual column values, not the whole tuple, and those values can either be compressed inline within the tuple or compressed and then stored externally in the TOAST table, so those properties also apply to this feature. Prior to this commit, a TOAST pointer has two unused bits as part of the va_extsize field, and a compressed datum has two unused bits as part of the va_rawsize field. These bits are unused because the length of a varlena is limited to 1GB; we now use them to indicate the compression type that was used. This means we only have bit space for 2 more built-in compression types, but we could work around that problem, if necessary, by introducing a new vartag_external value for any further types we end up wanting to add. Hopefully, it won't be too important to offer a wide selection of algorithms here, since each one we add not only takes more coding but also adds a build dependency for every packager. Nevertheless, it seems worth doing at least this much, because LZ4 gets better compression than PGLZ with less CPU usage. It's possible for LZ4-compressed datums to leak into composite type values stored on disk, just as it is for PGLZ. It's also possible for LZ4-compressed attributes to be copied into a different table via SQL commands such as CREATE TABLE AS or INSERT .. SELECT. It would be expensive to force such values to be decompressed, so PostgreSQL has never done so. For the same reasons, we also don't force recompression of already-compressed values even if the target table prefers a different compression method than was used for the source data. These architectural decisions are perhaps arguable but revisiting them is well beyond the scope of what seemed possible to do as part of this project. However, it's relatively cheap to recompress as part of VACUUM FULL or CLUSTER, so this commit adjusts those commands to do so, if the configured compression method of the table happens not to match what was used for some column value stored therein. Dilip Kumar. The original patches on which this work was based were written by Ildus Kurbangaliev, and those patches were based on even earlier work by Nikita Glukhov, but the design has since changed very substantially, since allowing a potentially large number of compression methods to be added and dropped on a running system proved too problematic given some of the architectural issues mentioned above; the choice of which specific compression method to add first is now different; and a lot of the code has been heavily refactored. More recently, Justin Pryzby helped quite a bit with testing and reviewing, and this version also includes some code contributions from him. Other design input and review from Tomas Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander Korotkov, and me.
Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/bbe0a81db69bd10bd166907c3701492a29aca294
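
    A minimal sketch of the new options (the table and column names are hypothetical; requires a server built --with-lz4):

        CREATE TABLE blobs (
            payload text COMPRESSION lz4   -- per-column compression method
        );

        -- or make lz4 the default for newly created columns:
        SET default_toast_compression = 'lz4';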

  • Fix use-after-ReleaseSysCache problem in ATExecAlterColumnType. Introduced by commit bbe0a81db69bd10bd166907c3701492a29aca294. Per buildfarm member prion. https://git.postgresql.org/pg/commitdiff/d00fbdc431192c3e429b3e91c43d364e5c7ba680

Amit Kapila pushed:

Peter Eisentraut pushed:

Andres Freund pushed:

Magnus Hagander pushed:

Tomáš Vondra pushed:

  • Remove temporary files after backend crash. After a crash of a backend using temporary files, the files used to be left behind, on the basis that they might be useful for debugging. But we don't have any reports of anyone actually doing that, and it means the disk usage may grow over time due to repeated backend failures (possibly even hitting ENOSPC). So this behavior is a bit unfortunate, and fixing it required either manual cleanup (deleting files, which is error-prone) or restart of the instance (i.e. service disruption). This implements automatic cleanup of temporary files, controlled by a new GUC remove_temp_files_after_crash. By default the files are removed, but it can be disabled to restore the old behavior if needed. Author: Euler Taveira Reviewed-by: Tomas Vondra, Michael Paquier, Anastasia Lubennikova, Thomas Munro Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/cd91de0d17952b5763466cfa663e98318f26d357
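
    For example, the old keep-files-for-debugging behavior can be restored with:

        ALTER SYSTEM SET remove_temp_files_after_crash = off;
        SELECT pg_reload_conf();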

  • Implement GROUP BY DISTINCT. With grouping sets, it's possible that some of the grouping sets are duplicate. This is especially common with CUBE and ROLLUP clauses. For example GROUP BY CUBE (a,b), CUBE (b,c) is equivalent to GROUP BY GROUPING SETS ( (a, b, c), (a, b, c), (a, b, c), (a, b), (a, b), (a, b), (a), (a), (a), (c, a), (c, a), (c, a), (c), (b, c), (b), () ) Some of the grouping sets are calculated multiple times, which is mostly unnecessary. This commit implements a new GROUP BY DISTINCT feature, as defined in the SQL standard, which eliminates the duplicate sets. Author: Vik Fearing Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org https://git.postgresql.org/pg/commitdiff/be45be9c33a85e72cdaeb9967e9f6d2d00199e09
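
    A short illustration (table t and its columns are hypothetical):

        -- computes each distinct grouping set once, instead of the duplicate
        -- sets the two ROLLUP clauses would otherwise produce
        SELECT a, b, c, count(*)
        FROM t
        GROUP BY DISTINCT ROLLUP (a, b), ROLLUP (a, c);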

  • Fix TAP test for remove_temp_files_after_crash. The test included in cd91de0d17 had two simple flaws. Firstly, the number of rows was low, and on some platforms (e.g. 32-bit) the sort did not require an on-disk sort, so on those machines it was not testing the automatic removal. The test was however failing, because without any temporary files the base/pgsql_tmp directory was not even created. Fixed by increasing the rowcount to 5000, which should be high enough on any platform. Secondly, the test used a simple sleep to wait for the temporary file to be created. This is obviously problematic, because on slow machines (or with valgrind, CLOBBER_CACHE_ALWAYS etc.) it may take a while to create the temporary file, but we also want the tests to run reasonably fast. Fixed by instead relying on a UNIQUE constraint, blocking the query that created the temporary file. Author: Euler Taveira Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/a16b2b960f0eec2fe367e86017b3c24ed688ba2b

  • Fix race condition in remove_temp_files_after_crash TAP test. The TAP test was written so that it was not waiting for the correct SQL command, but for output from the preceding one. This resulted in race conditions, allowing the commands to run in a different order, not block as expected and so on. This fixes it by inverting the order of commands where possible, so that observing the output guarantees the data was inserted properly, and waiting for a lock to appear in pg_locks. Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/e589c4890b05044a04207c2797e7c8af6693ea5f

  • Use lfirst_int in cmp_list_len_contents_asc. The function added in be45be9c33 is comparing integer lists (IntList) by length and contents, but there were two bugs. Firstly, it used intVal() to extract the value, but that's for Value nodes, not for extracting int values from IntList. Secondly, it called it directly on the ListCell, without doing lfirst(). So just do lfirst_int() instead. Interestingly enough, this did not cause any crashes on the buildfarm, but valgrind rightfully complained about it. Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org https://git.postgresql.org/pg/commitdiff/b822ae13ea93c18326d58d47829bbc66d36fae5c

  • Use valid compression method in brin_form_tuple. When compressing the BRIN summary, we can't simply use the compression method from the indexed attribute. The summary may use a different data type; e.g. a fixed-length attribute may have a varlena summary, leading to compression failures. For the built-in BRIN opclasses this happens to work, because the summary uses the same data type as the attribute. When the data types match, we can use the compression method specified for the attribute (it's copied into the index descriptor). Otherwise we don't have much choice and have to use the default one. Author: Tomas Vondra Reviewed-by: Justin Pryzby pryzby@telsasoft.com Discussion: https://postgr.es/m/e0367f27-392c-321a-7411-a58e1a7e4817%40enterprisedb.com https://git.postgresql.org/pg/commitdiff/882b2cdc08c4100e273f24742e2118be98708a07

Andrew Dunstan pushed:

Pending Patches

Erica Zhang sent in another revision of a patch to add some tests for pg_stat_statements compatibility verification under contrib.

Jürgen Purtz sent in another revision of a patch to change the JOIN tutorial to focus more on explicit joins.

Noah Misch sent in a patch to merge similar algorithms into roles_is_member_of().

David Rowley sent in another revision of a patch to cache PathTarget and RestrictInfo's volatility, allow estimate_num_groups() to pass back further details about the estimation, allow the users of simplehash.h to perform direct deletions, add a Result Cache executor node, and remove code duplication in nodeResultCache.c.

Vigneshwaran C sent in a patch to make a prepared transaction fail if the transaction has locked system tables or user catalog tables.

Laurenz Albe sent in another revision of a patch to improve \e, \ef and \ev if the editor is quit without saving by erasing the query buffer rather than executing the previous statement.

Nathan Bossart sent in another revision of a patch to avoid creating archive status ".ready" files too early, and keep track of notified-ready-for-archive position through crashes.

Mike Insoo sent in a patch to check whether, when decoding a catalog-modifying commit record, it is part of the RUNNING_XACT xids processed at the restart_lsn, and if so, to add its xid and subxacts to the committed transactions list in the logical decoding snapshot.

Justin Pryzby sent in three more revisions of a patch to make it possible to use different compression methods for FPI.

Jim Mlodgenski sent in another revision of a patch to add a parser hook and a test that uses it.

Julien Rouhaud sent in another revision of a patch to add a new OUTDATED filtering facility for the REINDEX command, with a corresponding option for the reindexdb utility. When used, REINDEX will only process indexes that have an outdated dependency. For now, only dependencies on collations are supported, but other kinds of dependencies will likely be supported in the future.

Georgios Kokolatos and Michaël Paquier traded patches to make dbsize a bit more consistent.

Amit Kapila, Justin Pryzby, and Greg Nancarrow traded patches to make it possible to execute INSERT .... SELECT in parallel.

Bruce Momjian sent in two more revisions of a patch to implement key management.

Bharath Rupireddy sent in four more revisions of a patch to add a version of pg_terminate_backend() with wait and timeout.
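
A sketch of the proposed call, with a hypothetical PID and the second argument as a wait timeout in milliseconds:

    SELECT pg_terminate_backend(12345, 100);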

Masahiro Ikeda sent in another revision of a patch to make the WAL receiver report WAL statistics.

Iwata Aya sent in four more revisions of a patch to implement a libpq debug log.

Amul Sul sent in two more revisions of a patch to add the infrastructure ALTER SYSTEM READ {ONLY | WRITE} will use.

Jacob Champion and Michaël Paquier traded patches to log authenticated identity from all auth backends.

Masahiro Ikeda sent in four revisions of a patch to make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.

Mark Rofail and Andreas Karlsson traded patches to implement foreign key arrays.

Ajin Cherian, Peter Smith, and Takamichi Osumi traded patches to implement logical decoding of two-phase transactions.

Álvaro Herrera sent in three more revisions of a patch to implement ALTER TABLE .. DETACH PARTITION CONCURRENTLY.

Amit Langote sent in another revision of a patch to allow batching of inserts during cross-partition updates.

Ivan Panchenko sent in another revision of a patch to add a login trigger.

Peter Eisentraut sent in another revision of a patch to make it possible to get dynamic result sets from procedures. Declaring a cursor WITH RETURN in a procedure causes the cursor's data to be returned as a result of the CALL invocation. The procedure needs to be declared with the DYNAMIC RESULT SETS attribute.

Amit Kapila sent in a doc patch to add a description of substream in pg_subscription.

Amit Kapila sent in a patch to prevent cancel/die interrupts while creating tablesync slot.

Bharath Rupireddy sent in another revision of a patch to refactor the code that implements REFRESH MATERIALIZED VIEW for comprehensibility.

Ranier Vilela sent in a patch to cover a corner case in renaming files on Windows.

Hou Zhijie sent in another revision of a patch to extend the safety check to support parallel insert into fk relations, and skip the CommandCounterIncrement.

Fujii Masao sent in two more revisions of a patch to improve pgbench's \sleep command.

Andres Freund and Kyotaro HORIGUCHI traded patches to make the stats collector use shared memory instead of files for temporary storage.

Masahiko Sawada sent in another revision of a patch to fix transactions involving multiple postgres foreign servers.

Bharath Rupireddy sent in another revision of a patch to improve the "PID XXXX is not a PostgreSQL server process" message. Instead of the message "is not a server process", just say "is not a backend process". This makes it clear that the given PID is not a backend process, but could be some other process's PID, another postmaster child process's PID, or an invalid PID.

Zeng Wenjing sent in another revision of a patch to implement global temporary tables.

Tomáš Vondra sent in another revision of a patch to implement index skip scans.

Atsushi Torikoshi sent in another revision of a patch to add memorycontext elog print.

John Naylor sent in two more revisions of a patch to fix a bug in heap space management that was overly cautious about fillfactor. Previously, if there were leftover unused line pointers in a page, a table with a low fillfactor would not make that page available for new tuples large enough to exceed fillfactor, even though the page contained no tuples at all. This led to unnecessary relation extension. Fix by allowing some slack space equal to 1/8 the maximum space that could be taken up by line pointers. This is somewhat arbitrary, but should allow many cases to succeed where they didn't before. Per report from Floris van Nee.

Pavel Stěhule sent in another revision of a patch to add routine labels to PL/pgsql.

Michaël Paquier sent in another revision of a patch to fix an infelicity between PITR and 2PC.

Bertrand Drouvot sent in another revision of a patch to implement minimal logical decoding on standbys.

Amit Langote sent in a patch to tweak handling of serialization failures during cascaded RI update/delete.

Peter Eisentraut sent in another revision of a patch to libpq to set Server Name Indication (SNI) for SSL connections. By default, libpq sets the TLS extension "Server Name Indication" (SNI), which allows an SNI-aware SSL proxy to route connections. (This requires a proxy that is aware of the PostgreSQL protocol, not just any SSL proxy.) In the future, this could also allow the server to use different SSL certificates for different host specifications. (That would require new server functionality; this would be the client-side functionality for it.) Since SNI makes the host name appear in cleartext in the network traffic, this might be undesirable in some cases. Therefore, the patch also adds a libpq connection option "sslsni" to turn it off.

Amit Langote sent in another revision of a patch to mark fully grouped relations partitioned if input relation is, and allow setting parallel_workers on partitioned tables.

Ibrar Ahmed sent in another revision of a patch to make it possible for INSERT SELECT to use BulkInsertState and multi_insert.

John Naylor sent in another revision of a patch to add a 'noError' argument to encoding conversion functions, and use same to do COPY FROM encoding conversion/verification in larger chunks.

Thomas Munro sent in another revision of a patch to use prefetching for WALs.

Peter Geoghegan sent in a patch to add a way to ignore pgindent runs and similar when using git blame.

Peter Eisentraut sent in another revision of a patch to add a result_format_auto_binary_types setting.

Fujii Masao sent in another revision of a patch to change the type of WalReceiverWaitStart wait event from Client to IPC.

Marina Polyakova sent in a patch to reduce the time required to recover database from archive by allowing concurrent loading of WAL files.

Peter Eisentraut sent in another revision of a patch to implement SQL-standard function bodies.
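
A sketch of the SQL-standard syntax the patch implements; the body is parsed at definition time rather than being stored as an opaque string:

    CREATE FUNCTION add(a int, b int) RETURNS int
    LANGUAGE SQL
    BEGIN ATOMIC
        RETURN a + b;
    END;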

Andrei Zubkov sent in a patch to correct deallocation events accounting in pg_stat_statements.

James Coleman sent in another revision of a patch to improve the error message when a standby connection is denied. Currently, when a standby has finished starting up but hot_standby is configured to off, the error message when a client connects is "the database system is starting up", which is needlessly confusing (and not really all that accurate either). Instead, send a helpful error message so that the user immediately knows that their server is configured to deny these connections.

John Naylor sent in another revision of a patch to add a 'noError' argument to encoding conversion functions, then use same to replace pg_utf8_verifystr() with two faster implementations: one using SSE-4.1 instructions on Intel, the other using a bespoke fallback function, rather than one that relies on pg_utf8_verifychar() and pg_utf8_isvalid(). This one is loosely based on the fallback that is part of the simdjson library.

Peter Eisentraut sent in another revision of a patch to change the return type of EXTRACT to numeric.
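
With the patch applied, the change should be visible via pg_typeof, for example:

    SELECT pg_typeof(extract(epoch FROM timestamptz '2021-03-14 12:00:00+00'));
    -- numeric, rather than double precision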

James Coleman sent in another revision of a patch to hash lookup const arrays in OR'd ScalarArrayOps.

Amit Khandekar sent in another revision of a patch to speed up xor'ing of two gist index signatures for tsvectors by using 64-bit chunks instead of char values, and avoid function pointer dereferencing for the pg_popcount32/64() call.

Etsuro Fujita sent in another revision of a patch to implement asynchronous append on postgres_fdw nodes.

Konstantin Knizhnik sent in another revision of a patch to add two statistics GUCs to auto_explain: add_statistics_threshold, and add_statistics_suggest_only.

Gilles Darold sent in another revision of a patch to add hooks at the XactCommand level.

Tatsuo Ishii sent in three more revisions of a patch to make it possible to use COPY FREEZE in pgbench.

Julien Rouhaud sent in another revision of a patch to add queryId to the pg_catalog.pg_stat_activity view.
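
With the patch applied, something along these lines would show the query identifier next to each backend (the query_id column name is the one proposed in the patch):

    SELECT pid, query_id, left(query, 40) AS query
    FROM pg_stat_activity;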

Amit Langote sent in another revision of a patch to export get_partition_for_tuple() so that code other than execPartition.c can use it, and to use it elsewhere to avoid using SPI for some RI checks.

Peter Smith sent in a patch to fix a bug where the code dereferenced the hash entry returned by HASH_REMOVE in stream_cleanup_files.

Thomas Munro sent in another revision of a patch to support PSQL_WATCH_PAGER for psql's \watch command, and try sigwait() to make it work.

Michaël Paquier sent in a patch to clean up the Kerberos auth test.

Thomas Munro sent in another revision of a patch to provide recovery_init_sync_method=wal.

Gilles Darold and Erik Rijkers traded patches to add some new functions for counting and otherwise characterizing regexp matches.

David Fetter sent in another revision of a patch to implement popcount (now called bit_count).
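
As proposed, the function counts set bits in both bit strings and bytea values; two hypothetical calls:

    SELECT bit_count(B'11011000');           -- 4
    SELECT bit_count('\xdeadbeef'::bytea);   -- 24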

Justin Pryzby sent in another revision of a patch to add a way to use different compression methods for FPI.

Matthias van de Meent sent in a patch to ensure that when [+] is an option for a psql metacommand, \? output mentions that it is, and not otherwise.

Jan Wieck and Tom Lane traded patches to ensure that pg_upgrade preserves datdba.

Konstantin Knizhnik sent in another revision of a patch to implement a built-in connection pooler.


PostgreSQL Community Code of Conduct Committee Annual Report for 2020

March 16, 2021 - 01:00

The PostgreSQL Community Code of Conduct Committee Annual Report for 2020 is now available at:

https://www.postgresql.org/about/policies/coc/reports/2020/

All complaints, actions, and statistics are anonymized and aggregated to protect the identities of all parties involved.

You can contact the Committee with questions or comments by email: coc@postgresql.org

Regards,

Stacey Haysler, Chair

PostgreSQL Community Code of Conduct Committee


PostgreSQL Weekly News - March 14, 2021

March 15, 2021 - 01:00
PostgreSQL Weekly News - March 14, 2021

Person of the week: https://postgresql.life/post/corey_huinker/

PostgreSQL Jobs for March

https://archives.postgresql.org/pgsql-jobs/2021-03/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Amit Kapila pushed:

Tom Lane pushed:

  • Further tweak memory management for regex DFAs. Coverity is still unhappy after commit 190c79884, and after looking closer I think it might be onto something. The callers of newdfa() typically drop out if v->err has been set nonzero, which newdfa() is faithfully doing if it fails. However, what if v->err was already nonzero before we entered newdfa()? Then newdfa() could succeed and the caller would promptly leak its result. I don't think this scenario can actually happen, but the predicate "v->err is always zero when newdfa() is called" seems difficult to be entirely sure of; there's a good deal of code that potentially could get that wrong. It seems better to adjust the callers to directly check for a null result instead of relying on ISERR() tests. This is slightly cheaper than the previous coding anyway. Lacking evidence that there's any real bug, no back-patch. https://git.postgresql.org/pg/commitdiff/6c20bdb2a279086777a3595ab00bcf14671fc5a1

  • Validate the OID argument of pg_import_system_collations(). "SELECT pg_import_system_collations(0)" caused an assertion failure. With a random nonzero argument --- or indeed with zero, in non-assert builds --- it would happily make pg_collation entries with garbage values of collnamespace. These are harmless as far as I can tell (unless maybe the OID happens to become used for a schema, later on?). In any case this isn't a security issue, since the function is superuser-only. But it seems like a gotcha for unwary DBAs, so let's add a check that the given OID belongs to some schema. Back-patch to v10 where this function was introduced. https://git.postgresql.org/pg/commitdiff/5c06abb9b97b69513a3998cccf89556e73052e02
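
    With the check in place, callers must pass the OID of an existing schema, e.g. (superuser only):

        SELECT pg_import_system_collations('pg_catalog'::regnamespace);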

  • Properly mark pg_stat_get_subscription() as returning a set. The initial catalog data for this function failed to set proretset or provide a prorows estimate. It accidentally worked anyway when invoked in the FROM clause, because the executor isn't too picky about this; but the planner didn't expect the function to return multiple rows, which could lead to bad plans. Also the function would fail if invoked in the SELECT list. We can't easily back-patch this fix, but fortunately the bug's consequences aren't awful in most cases. Getting this right is mainly an exercise in future-proofing. Discussion: https://postgr.es/m/1636062.1615141782@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/fed10d4eec79242688382d03ddca82007160ee6f

  • Complain if a function-in-FROM returns a set when it shouldn't. Throw a "function protocol violation" error if a function in FROM tries to return a set though it wasn't marked proretset. Although such cases work at the moment, it doesn't seem like something we want to guarantee will keep working. Besides, there are other negative consequences of not setting the proretset flag, such as potentially bad plans. No back-patch, since if there is any third-party code violating this expectation, people wouldn't appreciate us breaking it in a minor release. Discussion: https://postgr.es/m/1636062.1615141782@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/d4545dc19b8ea670bf62e06d22b0e4e6fcb45153

  • Doc: fix missing mention of procedure OUT parameters. Small oversight in commit 2453ea142. https://git.postgresql.org/pg/commitdiff/3ebc6d295705fec37dc8f57a4ece54b370f55f72

  • Doc: improve introductory information about procedures. Clarify the discussion in "User-Defined Procedures", by laying out the key differences between functions and procedures in a bulleted list. Notably, this avoids burying the lede about procedures being able to do transaction control. Make the back-link in the CREATE FUNCTION reference page more prominent, and add one in CREATE PROCEDURE. Per gripe from Guyren Howe. Thanks to David Johnston for discussion. Discussion: https://postgr.es/m/BYAPR03MB4903C53A8BB7EFF5EA289674A6949@BYAPR03MB4903.namprd03.prod.outlook.com https://git.postgresql.org/pg/commitdiff/227338b00d498d9e1c5705a1ab118585e5d57c87

  • Doc: get rid of <foreignphrase> tags. We italicized some, but not all, instances of "per se", "pro forma", and "ad hoc". These phrases are widespread in formal registers of English, so it's debatable whether they even qualify as foreign. We could instead try to be more consistent in the use of <foreignphrase>, but that's difficult to enforce, so let's just remove the tags for those words. The one case that seems to deserve the tag is "voilà". Instead of keeping just one instance of the tag, change that to a more standard phrase. John Naylor Discussion: https://postgr.es/m/CAFBsxsHtWs_NsccAVgQ=tTUKkXHpHdkjZXtp_Cd9dGWyBDxfbQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b12436340adf27aa3d334c92579e6662dd3090ce

  • Avoid creating duplicate cached plans for inherited FK constraints. When a foreign key constraint is applied to a partitioned table, each leaf partition inherits a similar FK constraint. We were processing all of those constraints independently, meaning that in large partitioning trees we'd build up large collections of cached FK-checking query plans. However, in all cases but one, the generated queries are actually identical for all members of the inheritance tree (because, in most cases, the query only mentions the topmost table of the other side of the FK relationship). So we can share a single cached plan among all the partitions, saving memory, not to mention time to build and maintain the cached plans. Keisuke Kuroda and Amit Langote Discussion: https://postgr.es/m/cab4b85d-9292-967d-adf2-be0d803c3e23@nttcom.co.jp_1 https://git.postgresql.org/pg/commitdiff/c3ffe34863688115dd7878f118f2a123bafd8a26

  • Re-simplify management of inStart in pqParseInput3's subroutines. Commit 92785dac2 copied some logic related to advancement of inStart from pqParseInput3 into getRowDescriptions and getAnotherTuple, because it wanted to allow user-defined row processor callbacks to potentially longjmp out of the library, and inStart would have to be updated before that happened to avoid an infinite loop. We later decided that that API was impossibly fragile and reverted it, but we didn't undo all of the related code changes, and this bit of messiness survived. Undo it now so that there's just one place in pqParseInput3's processing where inStart is advanced; this will simplify addition of better tracing support. getParamDescriptions had grown similar processing somewhere along the way (not in 92785dac2; I didn't track down just when), but it's actually buggy because its handling of corrupt-message cases seems to have been copied from the v2 logic where we lacked a known message length. The cases where we "goto not_enough_data" should not simply return EOF, because then we won't consume the message, potentially creating an infinite loop. That situation now represents a definitively corrupt message, and we should report it as such. Although no field reports of getParamDescriptions getting stuck in a loop have been seen, it seems appropriate to back-patch that fix. I chose to back-patch all of this to keep the logic looking more alike in supported branches. Discussion: https://postgr.es/m/2217283.1615411989@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/51c54bb603098416dc6f9d9d46a3d14861f8fc38

  • Forbid marking an identity column as nullable. GENERATED ALWAYS AS IDENTITY implies NOT NULL, but the code failed to complain if you overrode that with "GENERATED ALWAYS AS IDENTITY NULL". One might think the old behavior was a feature, but it was inconsistent because the outcome varied depending on the order of the clauses, so it seems to have been just an oversight. Per bug #16913 from Pavel Boev. Back-patch to v10 where identity columns were introduced. Vik Fearing (minor tweaks by me) Discussion: https://postgr.es/m/16913-3b5198410f67d8c6@postgresql.org https://git.postgresql.org/pg/commitdiff/f52c5d6749a61fc4e0908457c58f5069910d53a6
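
    For example, this is now rejected with an error instead of silently creating a nullable identity column (table name hypothetical):

        CREATE TABLE t (
            id int GENERATED ALWAYS AS IDENTITY NULL   -- error as of this fix
        );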

  • Fix race condition in psql \e's detection of file modification. psql's editing commands decide whether the user has edited the file by checking for change of modification timestamp. This is probably fine for a pre-existing file, but with a temporary file that is created within the command, it's possible for a fast typist to save-and-exit in less than the one-second granularity of stat(2) timestamps. On Windows FAT filesystems the granularity is even worse, 2 seconds, making the race a bit easier to hit. To fix, try to set the temp file's mod time to be two seconds ago. It's unlikely this would fail, but then again the race condition itself is unlikely, so just ignore any error. Also, we might as well check the file size as well as its mod time. While this is a difficult bug to hit, it still seems worth back-patching, to ensure that users' edits aren't lost. Laurenz Albe, per gripe from Jacob Champion; based on fix suggestions from Jacob and myself Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b.camel@cybertec.at https://git.postgresql.org/pg/commitdiff/48d67fd897918c72e7cdf703d794056b88ed5725

  • pg_amcheck: Keep trying to fix the tests. Fix another example of non-portable option ordering in the tests. Oversight in 24189277f. Mark Dilger Discussion: https://postgr.es/m/C37D28BA-3BA3-4776-B812-17F05F3472D8@enterprisedb.com https://git.postgresql.org/pg/commitdiff/9e294d0f34d6e3e4fecf6f190b48862988934cde

  • Doc: add note about how to run the pg_amcheck regression tests. It's not immediately obvious what you have to do to get "make installcheck" to work here, so document that along the same lines as we've used elsewhere. https://git.postgresql.org/pg/commitdiff/58f57490facdec78119e5bab84229dbdc1d5ac6a

Michaël Paquier pushed:

  • Remove support for SSL compression. PostgreSQL disabled compression as of e3bdb2d, and the documentation has recommended against using it since. Additionally, SSL compression has been disabled in OpenSSL since version 1.1.0, and was disabled in many distributions long before that. The most recent TLS version, TLSv1.3, disallows compression at the protocol level. This commit removes the feature itself, removing support for the libpq parameter sslcompression (the parameter is still listed for compatibility reasons with existing connection strings, just ignored), and removes the equivalent field in pg_stat_ssl and de facto PgBackendSSLStatus. Note that, on top of removing the ability to activate compression by configuration, compression is actively disabled in both frontend and backend to avoid overrides from local configurations. A TAP test is added for deprecated SSL parameters to check backwards compatibility. Bump catalog version. Author: Daniel Gustafsson Reviewed-by: Peter Eisentraut, Magnus Hagander, Michael Paquier Discussion: https://postgr.es/m/7E384D48-11C5-441B-9EC3-F7DB1F8518F6@yesql.se https://git.postgresql.org/pg/commitdiff/f9264d1524baa19e4a0528f033681ef16f61b137
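
    After this change pg_stat_ssl no longer has a "compression" column; the remaining fields can still be inspected with, e.g.:

        SELECT pid, ssl, version, cipher FROM pg_stat_ssl WHERE ssl;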

  • Add support for more progress reporting in COPY. The command (TO or FROM), its type (file, pipe, program or callback), and the number of tuples excluded by a WHERE clause in COPY FROM are added to the progress reporting already available. The column "lines_processed" is renamed to "tuples_processed" to disambiguate the meaning of this column in the cases of CSV and BINARY COPY and to be more consistent with the other catalog progress views. Bump catalog version, again. Author: Matthias van de Meent Reviewed-by: Michael Paquier, Justin Pryzby, Bharath Rupireddy, Josef Šimánek, Tomas Vondra Discussion: https://postgr.es/m/CAEze2WiOcgdH4aQA8NtZq-4dgvnJzp8PohdeKchPkhMY-jWZXA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9d2d45700928d49212fb7ed140feeaebe3a6014f
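
    While a COPY is running, the updated view can be queried along these lines:

        SELECT relid::regclass, command, type,
               tuples_processed, tuples_excluded
        FROM pg_stat_progress_copy;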

  • Switch back sslcompression to be a normal input field in libpq. Per buildfarm member crake, any servers including a postgres_fdw server with this option set would fail to do a pg_upgrade properly as the option got hidden in f9264d1 by becoming a debug option, making the restore of the FDW server fail. This changes back the option in libpq to be visible, but still inactive to fix this upgrade issue. Discussion: https://postgr.es/m/YEbq15JKJwIX+S6m@paquier.xyz https://git.postgresql.org/pg/commitdiff/096bbf7c934a4288c9e48a6ac8e91d8753ac1ccd

  • Revert changes for SSL compression in libpq. This partially reverts 096bbf7 and 9d2d457, undoing the libpq changes as it could cause breakages in distributions that share one single libpq version across multiple major versions of Postgres for extensions and applications linking to that. Note that the backend is unchanged here, and it still disables SSL compression while simplifying the underlying catalogs that tracked if compression was enabled or not for a SSL connection. Per discussion with Tom Lane and Daniel Gustafsson. Discussion: https://postgr.es/m/YEbq15JKJwIX+S6m@paquier.xyz https://git.postgresql.org/pg/commitdiff/0ba71107efeeccde9158f47118f95043afdca0bb

  • Move tablespace path re-creation from the makefiles to pg_regress. Moving this logic into pg_regress fixes a potential failure with parallel tests when pg_upgrade and the main regression test suite both trigger the makefile rule that cleaned up testtablespace/ under src/test/regress. Even if pg_upgrade was triggering this rule, it has no need to do so as it uses a different tablespace path. So if pg_upgrade triggered the makefile rule for the tablespace setup while the main regression test suite ran the tablespace cases, it would fail. 61be85a was a similar attempt at achieving that, but it broke cases where the regression tests need to run under an Administrator account, like with Appveyor. Reported-by: Andres Freund, Kyotaro Horiguchi Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/20201209012911.uk4d6nxcnkp7ehrx@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/6c788d9f6aadb41d76a72d56149268371a7895ee

  • Set libcrypto callbacks for all connection threads in libpq. Based on an analysis of the OpenSSL code with Jacob, moving to EVP for the cryptohash computations makes necessary the setup of the libcrypto callbacks that were getting set only for SSL connections, but not for connections without SSL. Not setting the callbacks makes the use of threads potentially unsafe for connections calling cryptohashes during authentication, like MD5 or SCRAM, if a failure happens during a cryptohash computation. The logic setting the libssl and libcrypto states is then split into two parts, both using the same locking, with libcrypto being set up for SSL and non-SSL connections, while SSL connections set any libssl state afterwards as needed. Prior to this commit, only SSL connections would have set libcrypto callbacks that are necessary to ensure a proper thread locking when using multiple concurrent threads in libpq (ENABLE_THREAD_SAFETY). Note that this is only required for OpenSSL 1.0.2 and 1.0.1 (oldest version supported on HEAD), as 1.1.0 has its own internal locking and it has dropped support for CRYPTO_set_locking_callback(). Tests with up to 300 threads with OpenSSL 1.0.1 and 1.0.2, mixing SSL and non-SSL connection threads did not show any performance impact after some micro-benchmarking. pgbench can be used here with -C and a mostly-empty script (with one \set meta-command for example) to stress authentication requests, and we have mixed that with some custom programs for testing. Reported-by: Jacob Champion Author: Michael Paquier Reviewed-by: Jacob Champion Discussion: https://postgr.es/m/fd3ba610085f1ff54623478cf2f7adf5af193cbb.camel@vmware.com https://git.postgresql.org/pg/commitdiff/2c0cefcd18161549e9e8b103f46c0f65fca84d99

Fujii Masao pushed:

  • Track total amounts of times spent writing and syncing WAL data to disk. This commit adds a new GUC, track_wal_io_timing. When this is enabled, the total amounts of time XLogWrite writes and issue_xlog_fsync syncs WAL data to disk are counted in pg_stat_wal. This information is useful for checking how much WAL writes and syncs affect performance. Enabling track_wal_io_timing will make the server query the operating system for the current time every time WAL is written or synced, which may cause significant overhead on some platforms. To avoid such additional overhead in servers with track_io_timing enabled, this commit introduces track_wal_io_timing as a separate parameter from track_io_timing. Note that WAL write and sync activity by walreceiver is not tracked yet. This commit also makes the server track the numbers of times XLogWrite writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal, regardless of the setting of track_wal_io_timing. These counters can be used to calculate the WAL write and sync time per request, for example. Bump PGSTAT_FILE_FORMAT_ID. Bump catalog version. Author: Masahiro Ikeda Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/ff99918c625a84c91e7391db9032112ec8653623
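
    A minimal way to exercise this (the timing columns stay at zero unless track_wal_io_timing is enabled, which is superuser-only and adds timing overhead):

        SET track_wal_io_timing = on;
        SELECT wal_write, wal_sync, wal_write_time, wal_sync_time
        FROM pg_stat_wal;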

  • Force walwriter to send remaining WAL stats to the stats collector at exit. In walwriter's main loop, the WAL stats message is only sent if enough time has passed since the last one was sent to reach PGSTAT_STAT_INTERVAL msecs. This is necessary to avoid overloading the stats collector. But it can cause recent WAL stats to go unsent when walwriter exits. To ensure that all the WAL stats are sent, this commit makes walwriter forcibly send any remaining WAL stats to the collector when it exits because of a shutdown request. Note that those remaining WAL stats can still go unsent when walwriter exits with a non-zero exit code (e.g., a FATAL error). This is OK, because such an exit leads to a server crash, and subsequent recovery discards all the stats. So there is no need to send remaining stats in that case. Author: Masahiro Ikeda Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/33394ee6f2433d3cc7785428a77cc9a813254df7

  • Send statistics collected during shutdown checkpoint to the stats collector. When shutdown is requested, the checkpointer performs a checkpoint or restartpoint, and updates the statistics, before it exits. But previously the checkpointer didn't send those statistics to the stats collector. Shutdown checkpoints and restartpoints are treated as requested ones instead of scheduled ones, so they are counted in the pg_stat_bgwriter.checkpoints_req column (see the sketch below). Author: Masahiro Ikeda Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/b82640df0062483431608b7e9e074255f03e6c02
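
Taken together, these three commits make WAL and checkpoint activity auditable from SQL. A minimal sketch of how the new counters might be inspected (column names as described above; the timing columns stay at zero unless the new GUC is enabled):

    -- enable timing of WAL writes/syncs (off by default; it costs extra clock queries)
    ALTER SYSTEM SET track_wal_io_timing = on;
    SELECT pg_reload_conf();

    -- write/sync counts are collected regardless of track_wal_io_timing
    SELECT wal_write, wal_sync, wal_write_time, wal_sync_time FROM pg_stat_wal;

    -- shutdown checkpoints/restartpoints are counted as requested checkpoints
    SELECT checkpoints_req, checkpoints_timed FROM pg_stat_bgwriter;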

Peter Eisentraut pushed:

Alexander Korotkov pushed:

Thomas Munro pushed:

Bruce Momjian pushed:

Peter Geoghegan pushed:

  • Don't consider newly inserted tuples in nbtree VACUUM. Remove the entire idea of "stale stats" within nbtree VACUUM (stop caring about stats involving the number of inserted tuples). Also remove the vacuum_cleanup_index_scale_factor GUC/param on the master branch (though just disable them on postgres 13). The vacuum_cleanup_index_scale_factor/stats interface made the nbtree AM partially responsible for deciding when pg_class.reltuples stats needed to be updated. This seems contrary to the spirit of the index AM API, though -- it is not actually necessary for an index AM's bulk delete and cleanup callbacks to provide accurate stats when it happens to be inconvenient. The core code owns that. (Index AMs have the authority to perform or not perform certain kinds of deferred cleanup based on their own considerations, such as page deletion and recycling, but that has little to do with pg_class.reltuples/num_index_tuples.) This issue was fairly harmless until the introduction of the autovacuum_vacuum_insert_threshold feature by commit b07642db, which had an undesirable interaction with the vacuum_cleanup_index_scale_factor mechanism: it made insert-driven autovacuums perform full index scans, even though there is no real benefit to doing so. This has been tied to a regression with an append-only insert benchmark [1]. Also have remaining cases that perform a full scan of an index during a cleanup-only nbtree VACUUM indicate that the final tuple count is only an estimate. This prevents vacuumlazy.c from setting the index's pg_class.reltuples in those cases (it will now only update pg_class when vacuumlazy.c had TIDs for nbtree to bulk delete). This arguably fixes an oversight in deduplication-related bugfix commit 48e12913. [1] https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAD21AoA4WHthN5uU6+WScZ7+J_RcEjmcuH94qcoUPuB42ShXzg@mail.gmail.com Backpatch: 13-, where autovacuum_vacuum_insert_threshold was added. https://git.postgresql.org/pg/commitdiff/9f3665fbfc34b963933e51778c7feaa8134ac885

  • VACUUM ANALYZE: Always update pg_class.reltuples. vacuumlazy.c sometimes fails to update pg_class entries for each index (to ensure that pg_class.reltuples is current), even though analyze.c assumed that that must have happened during VACUUM ANALYZE. There are at least a couple of reasons for this. For example, vacuumlazy.c could fail to update pg_class when the index AM indicated that its statistics are merely an estimate, per the contract for amvacuumcleanup() routines established by commit e57345975cf back in 2006. Stop assuming that pg_class must have been updated with accurate statistics within VACUUM ANALYZE -- update pg_class for indexes at the same time as the table relation in all cases. That way VACUUM ANALYZE will never fail to keep pg_class.reltuples reasonably accurate. The only downside of this approach (compared to the old approach) is that it might inaccurately set pg_class.reltuples for indexes whose heap relation ends up with the same inaccurate value anyway. This doesn't seem too bad. We already consistently called vac_update_relstats() (to update pg_class) for the heap/table relation twice during any VACUUM ANALYZE -- once in vacuumlazy.c, and once in analyze.c. We now make sure that we call vac_update_relstats() at least once (though often twice) for each index. This is follow up work to commit 9f3665fb, which dealt with issues in btvacuumcleanup(). Technically this fixes an unrelated issue, though. btvacuumcleanup() no longer provides an accurate num_index_tuples value following commit 9f3665fb (when there was no btbulkdelete() call during the VACUUM operation in question), but hashvacuumcleanup() has worked in the same way for many years now. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com Backpatch: 13-, just like commit 9f3665fb. https://git.postgresql.org/pg/commitdiff/5f8727f5a679452f7bbdd6966a1586934dcaa84f

  • Doc: B-Tree only has one additional parameter. Oversight in commit 9f3665fb. Backpatch: 13-, just like commit 9f3665fb. https://git.postgresql.org/pg/commitdiff/3f0daeb02f8dd605f89de9aa2349137c09cc7fb4

  • Add back vacuum_cleanup_index_scale_factor parameter. Commit 9f3665fb removed the vacuum_cleanup_index_scale_factor storage parameter. However, that creates dump/reload hazards when moving across major versions. Add back the vacuum_cleanup_index_scale_factor parameter (though not the GUC of the same name) purely to avoid problems when using tools like pg_upgrade. The parameter remains disabled and undocumented (see the sketch after this list). No backpatch to Postgres 13, since vacuum_cleanup_index_scale_factor was only disabled by REL_13_STABLE's version of master branch commit 9f3665fb in the first place -- the parameter already looks like this on REL_13_STABLE. Discussion: https://postgr.es/m/YEm/a3Ko3nKnBuVq@paquier.xyz https://git.postgresql.org/pg/commitdiff/effdd3f3b633e88feaa675377075f02ecc99aee4

  • Save a few cycles during nbtree VACUUM. Avoid calling RelationGetNumberOfBlocks() unnecessarily in the common case where there are no deleted but not yet recycled pages to recycle during a cleanup-only nbtree VACUUM operation. Follow-up to commit e5d8a999, which (among other things) taught the "skip full scan" nbtree VACUUM mechanism to only trigger a full index scan when the absolute number of deleted pages in the index is considered excessive. https://git.postgresql.org/pg/commitdiff/7bb97211a5589265f3f88183ae9353639ab184c6

  • Consolidate nbtree VACUUM metapage routines. Simplify _bt_vacuum_needs_cleanup()'s signature (it only needs a single 'rel' argument now), and move it next to its sibling function in nbtpage.c. I believe that _bt_vacuum_needs_cleanup() was originally located in nbtree.c due to an include dependency issue. That's no longer an issue. Follow-up to commit 9f3665fb. https://git.postgresql.org/pg/commitdiff/02b5940dbea17d07a1dbcba3cbe113cc8b70f228
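
As a footnote to the vacuum_cleanup_index_scale_factor story above: the add-back in effdd3f3 means that index definitions carrying the old storage parameter still restore cleanly, even though the setting is now inert. A hedged sketch (table and index names are made up):

    -- accepted again purely for dump/reload compatibility; it no longer does anything
    CREATE INDEX t_a_idx ON t (a)
        WITH (vacuum_cleanup_index_scale_factor = 0.1);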

Robert Haas pushed:

Pending Patches

Etsuro Fujita sent in another revision of a patch to implement asynchronous append on postgres_fdw nodes.

Kyotaro HORIGUCHI sent in a patch to run 011_crash_recovery.pl with wal_level=minimal, and make sure published XIDs are persistent.

Ibrar Ahmed sent in another revision of a patch to pg_rewind which fixes the way the TLI is determined when the server has just been promoted, by looking at the minRecoveryPointTLI in the control file in addition to the ThisTimeLineID of the checkpoint.

Ibrar Ahmed sent in another revision of a patch to surface popcount to SQL.

Kota Miyake sent in another revision of a patch to fix the way pgbench's \sleep command works by requiring that it take an integer argument.

Michael Banck sent in another revision of a patch to add a new PGC_ADMINSET GUC context and a pg_change_role_settings default role.

Andrey Borodin sent in another revision of a patch to add sort support for sorting during GiST index builds for gist_btree types.

Pavel Stěhule sent in a patch to add operators ? and ->> for type record, and functions record_keys and record_each_text.

Ajin Cherian sent in a patch to make the stream_prepare_cb optional, which allows plugins to refuse to enable streaming and two_phase at the same time in logical replication.

Ibrar Ahmed sent in another revision of a patch to evaluate expressions at planning time for two more cases.

Joel Jacobson sent in four more revisions of a patch to add views pg_permissions and pg_ownerships, which greatly simplify the queries needed to establish permissions and ownership.

Ibrar Ahmed sent in another revision of a patch to implement system-versioned temporal tables.

Tomáš Vondra sent in two more revisions of a patch to implement BRIN multi-range indexes.

Tomáš Vondra and Stephen Frost traded patches to improve the logging of auto-vacuum and auto-analyze by including the I/O timing if track_io_timing is enabled. Also, for auto-analyze, they add the read rate and the dirty rate, similar to how that information has historically been logged for auto-vacuum.

Masahiro Ikeda and Fujii Masao traded patches to send stats for both walwriter and checkpointer when shutting down.

Dilip Kumar and Justin Pryzby traded patches to implement custom compression methods for tables.

Bharath Rupireddy sent in another revision of a patch to implement EXPLAIN [ANALYZE] REFRESH MATERIALIZED VIEW.
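
A sketch of the syntax the patch proposes (not committed; the view name is made up):

    EXPLAIN (ANALYZE) REFRESH MATERIALIZED VIEW my_view;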

Greg Nancarrow and Hou Zhijie traded patches to implement parallel INSERT (INTO ... SELECT ...).

Ibrar Ahmed and Kazutaka Onishi traded patches to implement TRUNCATE on foreign tables.

Peter Smith and Ajin Cherian traded patches to implement logical decoding of two-phase transactions.

Tatsuo Ishii sent in two revisions of a patch to make it possible to use COPY FREEZE in pgbench.

Peter Smith sent in a patch to give the tablesync worker an opportunity to see whether it can exit immediately (because it has already caught up), without needing to process a message first to discover that.

Yugo Nagata sent in another revision of a patch to implement incremental materialized views.

Arseny Sher sent in a patch to increase the visibility map (VM) cache used for index-only scans.

Peter Eisentraut sent in another revision of a patch to add support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations.

Heikki Linnakangas sent in another revision of a patch to move a few ResourceOwnerEnlarge() calls for safety and clarity, make resowners more easily extensible, and use a 64-bit murmur hash if available to make this go faster.

Tomáš Vondra sent in another revision of a patch to control the removal of temporary files after a crash with a new GUC, remove_temp_files_after_crash.
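
A sketch using the GUC name given in the patch (still under discussion, so the final spelling and default may differ):

    ALTER SYSTEM SET remove_temp_files_after_crash = on;
    SELECT pg_reload_conf();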

Jacob Champion sent in two more revisions of a patch to store the client's DN in port->peer_dn, and use this to log the authenticated identity from all auth backends.

Joel Jacobson sent in another revision of a patch to implement regexp_positions.

Justin Pryzby sent in a patch to make pg_dump use a cursor in getBlobs to mitigate huge memory use in the case of millions of large objects.

Tomáš Vondra sent in another revision of a patch to make GROUP BY more efficient.

Erica Zhang sent in two revisions of a patch to add some tests for pg_stat_statements compatibility verification under contrib.

Matthias van de Meent sent in three revisions of a patch to truncate a page's line pointer array when it has trailing unused ItemIds. This will allow reuse of what is effectively free space for data as well as new line pointers, instead of keeping it reserved for line pointers only. An additional benefit is that the HasFreeLinePointers hint-bit optimization now doesn't hint for free line pointers at the end of the array, slightly increasing the specificity of where the free lines are, and saving us from needing to search to the end of the array if all other entries are already filled.

Magnus Hagander sent in another revision of a patch to add PROXY protocol support.

Floris Van Nee sent in another revision of a patch to allow inserting tuples into almost-empty pages.

Marcus Wanner sent in a patch to add an xid argument to the filter_prepare callback for the output plugin.

Kyotaro HORIGUCHI and Fujii Masao traded patches to change the temporary storage used by the stats collector from files to shared memory.

Amul Sul and Ibrar Ahmed traded patches to put in some machinery to make it possible to implement ALTER SYSTEM READ {ONLY|WRITE}.

Masahiko Sawada sent in another revision of a patch to make autovacuum logs a little more informative.

Bharath Rupireddy sent in another revision of a patch to add new table AMs for multi- and single inserts, and use same for CTAS, REFRESH MATERIALIZED VIEW, and COPY.

Yuzuko Hosoya sent in another revision of a patch to release SPI plans for referential integrity with DISCARD ALL.

John Naylor sent in a patch to get rid of the <foreignphrase> tags in the docs.

John Naylor sent in a patch to make popcount xor try indirection at the buffer level.

Masahiko Sawada sent in a patch to remove an unneeded bsearch from vacuumlazy.c.

Craig Ringer sent in another revision of a patch to pass the target LWLock* and tranche ID to LWLock tracepoints, add to the tracepoints in LWLock routines, and add SDT tracepoints for backend type and postmaster pid on startup.

Bharath Rupireddy sent in two revisions of a patch to make the error messages in check_publication_add_relation a bit more informative and consistent.

Iwata Aya, Álvaro Herrera, and Tom Lane traded patches to add tracing capability to libpq.

David Rowley sent in two more revisions of a patch to cache PathTarget and RestrictInfo's volatility, allow estimate_num_groups() to pass back further details about the estimation, allow users of simplehash.h to perform direct deletions, add a Result Cache executor node, and remove code duplication in nodeResultCache.c.

Nathan Bossart and Laurenz Albe traded patches to document ALTER ROLE ... SET ROLE.

Mark Dilger and Robert Haas traded patches to add a frontend utility program, pg_amcheck.

Peter Geoghegan sent in another revision of a patch to recycle pages deleted during same VACUUM.

Thomas Munro sent in another revision of a patch to track relation sizes in shared memory, provide a lock-free fast path for smgrnblocks(), and update FIFO to LRU to sweep a valid cache.

Thomas Munro sent in a patch to make ProcSendSignal() more efficient by using pg_procno instead of scanning the ProcArray and keeping track of the startup process.

Hou Zhijie sent in another revision of a patch to avoid CommandCounterIncrement in RI triggers when performing an INSERT into a referencing table.

Álvaro Herrera sent in another revision of a patch to implement ALTER TABLE ... DETACH PARTITION CONCURRENTLY.

Álvaro Herrera sent in two more revisions of a patch to implement batch/pipelining support for libpq.

Fabien COELHO and Dean Rasheed traded patches to add a pseudo-random permutation function to pgbench.

Bruce Momjian sent in another revision of a patch to add key management.

Takayuki Tsunakawa sent in another revision of a patch to speed up COPY FROM into tables with foreign partitions.

Hao Wu sent in a patch intended to fix a bug where HotStandbyActive() returns true after the node is promoted.

Takayuki Tsunakawa sent in two revisions of a patch to fix an overflow when counting the number of buffers to invalidate.

Andrey Borodin sent in another revision of a patch to make all SLRU buffer sizes configurable.

Tom Lane sent in two revisions of a patch to clean up the documentation of variable usage in PL/pgsql.

Thomas Munro sent in another revision of a patch to run the checkpointer and bgworker in crash recovery, log buffer stats after crash recovery, and make it possible not to wait for the end-of-recovery checkpoint.

Justin Pryzby sent in two more revisions of a patch to allow alternate compression methods for wal_compression.

Mark Rofail and Justin Pryzby traded patches to implement foreign key arrays.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Michaël Paquier and Soumyadeep Chakraborty traded patches to fix an infelicity between PITR and 2PC.

Pavel Stěhule sent in two more revisions of a patch to add a #routine_label pragma to PL/pgsql.

Thomas Munro sent in a patch to add bsearch and unique templates to sort_template.h, supply sort/search specializations for some common scalar types, use qsort_oid() and friends in obvious places, supply specialized sort/search routines for ItemPtrData, use qsort_itemptr() and friends in various places, specialize the HeapTuple sort routine for ANALYZE by using an "encoded" format based on 48-bit integers that can be compared by subtraction, avoiding a branch, specialize the pagetable sort routines in tidbitmap.c, specialize some sort/search routines in nbtree code, and specialize the sort routine used by multixact.c.

Julien Rouhaud sent in another revision of a patch to move pg_stat_statements query jumbling to core, expose queryid in pg_stat_activity and log_line_prefix, and expose query identifier in verbose explain.

Dmitry Dolgov sent in another revision of a patch to introduce skip scans.

Julien Rouhaud sent in another revision of a patch to add a new OUTDATED filtering facility for REINDEX command, and add a corresponding --outdated option to reindexdb.

Alexander Lakhin sent in a patch to use microsoft_native_stat in pgwin32_open.

Thomas Munro sent in two more revisions of a patch to optionally use syncfs() for SyncDataDirectory() on Linux.


PostgreSQL Accepted for Google Summer of Code 2021!

March 13, 2021 - 01:00

Greetings,

The PostgreSQL Project is excited to announce that we have been accepted to participate in the Google Summer of Code (GSoC) program for the 14th time!

The GSoC program focuses on introducing students to Open Source software development by awarding stipends to university students who complete a free and open-source software project during their break from university. Students will propose a project and work with mentors from the PostgreSQL Community to complete that project over the course of three months, from June until August.

In order for PostgreSQL’s GSoC 2021 program to be successful, we need both students and mentors. If you are interested in being a mentor, please contact me (Stephen Frost, sfrost@snowman.net).

Students should be sure to review the requirements at the GSoC website:

https://summerofcode.withgoogle.com/

After reviewing the guidelines and requirements, students should reach out through one of the community channels, which include IRC, Slack, and the mailing lists, information about which is available here:

https://www.postgresql.org/community/

Note that there is a dedicated gsoc2021-students channel on Slack, while on IRC the general #postgresql channel can be used. Regarding modifications to PostgreSQL core, the pgsql-hackers mailing list is best, and for sub-projects, please use their appropriate list. If you are unsure about how to make contact, feel free to contact me.

Students may also reach out directly to mentors of projects they are interested in. Note that a student may propose any project they wish that meets the GSoC guidelines - so don’t feel limited to choosing one from the list if you have a great idea for how to improve PostgreSQL or other Open Source Software projects in the PostgreSQL ecosystem.

The full GSoC 2021 timeline is available here:

https://summerofcode.withgoogle.com/how-it-works/#timeline

We invite students and mentors to join and work together on enhancing the PostgreSQL ecosystem.

For more information about working with the PostgreSQL project as part of GSoC, please visit our GSoC website:

https://wiki.postgresql.org/wiki/GSoC

and check out our GSoC 2021 ideas page here:

https://wiki.postgresql.org/wiki/GSoC_2021

Thanks!

Stephen


pg_activity 2.1.0 released

March 9, 2021 - 01:00

pg_activity 2.1.0 was released on March 8th, 2021 under the PostgreSQL license.

pg_activity is an interactive terminal application for PostgreSQL server activity monitoring.

Changes (since version 2.0.0):

  • Let libpq handle default values for connection options (hostname, port, database name and user name)
  • Set application_name='pg_activity' for client connections
  • Add a --hide-queries-in-logs option to hide pg_activity's queries from server logs
  • Try to reconnect indefinitely when the connection is lost
  • Move SQL queries from Python code to individual SQL files
  • Truncate long database names on Python side
  • Do not display IDLE queries as None for old PostgreSQL versions

Bug fixes (since version 2.0.0):

  • Handle absence of some fields in memory info on OSX (version 2.0.1)
  • Handle 'query' field possibly being empty when displaying processes (version 2.0.1)
  • Fix sorting logic when the duration field is None (version 2.0.3)
  • Update man page to mention <connection string> argument
  • Use yellow instead of orange for PAUSE for compatibility with limited terminals

https://github.com/dalibo/pg_activity/


PostgreSQL Weekly News - March 7, 2021

March 8, 2021 - 01:00
PostgreSQL Weekly News - March 7, 2021

PostgreSQL Product News

parquet_s3_fdw 0.1, a foreign data wrapper for parquet files on S3, released. https://github.com/pgspider/parquet_s3_fdw/releases/tag/v0.1

PostgreSQL Jobs for March

https://archives.postgresql.org/pgsql-jobs/2021-03/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Thomas Munro pushed:

Amit Kapila pushed:

Andres Freund pushed:

  • Fix recovery test hang in 021_row_visibility.pl on windows. The psql processes were not explicitly killed (but would eventually exit due to postgres shutting down). For some reason Windows Perl doesn't like that, resulting in errors like Warning: unable to close filehandle GEN20 properly: Bad file descriptor during global destruction. The test was introduced in d6734a897e3, so no backpatching necessary. https://git.postgresql.org/pg/commitdiff/1e6e40447115ca7b4749d7d117b81b016ee5e2c2

Tom Lane pushed:

  • Improve reporting for syntax errors in multi-line JSON data. Point to the specific line where the error was detected; the previous code tended to include several preceding lines as well. Avoid re-scanning the entire input to recompute which line that was. Simplify the logic a bit. Add test cases. Simon Riggs and Hamid Akhtar, reviewed by Daniel Gustafsson and myself Discussion: https://postgr.es/m/CANbhV-EPBnXm3MF_TTWBwwqgn1a1Ghmep9VHfqmNBQ8BT0f+_g@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ffd3944ab9d481906137bc7d20f5325a2bd68acc

  • Fix semantics of regular expression back-references. POSIX defines the behavior of back-references thus: The back-reference expression '\n' shall match the same (possibly empty) string of characters as was matched by a subexpression enclosed between "(" and ")" preceding the '\n'. As far as I can see, the back-reference is supposed to consider only the data characters matched by the referenced subexpression. However, because our engine copies the NFA constructed from the referenced subexpression, it effectively enforces any constraints therein, too. As an example, '(^.)\1' ought to match 'xx', or any other string starting with two occurrences of the same character; but in our code it does not, and indeed can't match anything, because the '^' anchor constraint is included in the backref's copied NFA. If POSIX intended that, you'd think they'd mention it. Perl for one doesn't act that way, so it's hard to conclude that this isn't a bug. Fix by modifying the backref's NFA immediately after it's copied from the reference, replacing all constraint arcs by EMPTY arcs so that the constraints are treated as automatically satisfied. This still allows us to enforce matching rules that depend only on the data characters; for example, in '(^\d+).*\1' the NFA matching step will still know that the backref can only match strings of digits. Perhaps surprisingly, this change does not affect the results of any of a rather large corpus of real-world regexes. Nonetheless, I would not consider back-patching it, since it's a clear compatibility break. Patch by me, reviewed by Joel Jacobson Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/4aea704a5bfd4b5894a268499369ccab89940c9c

  • Improve performance of regular expression back-references. In some cases, at the time that we're doing an NFA-based precheck of whether a backref subexpression can match at a particular place in the string, we already know which substring the referenced subexpression matched. If so, we might as well forget about the NFA and just compare the substring; this is faster and it gives an exact rather than approximate answer. In general, this optimization can help while we are prechecking within the second child expression of a concat node, while the capture was within the first child expression; then the substring was saved during cdissect() of the first child and will be available to NFA checks done while cdissect() recurses into the second child. It can help quite a lot if the tree is a concat node whose first child is a capture and whose second child is a concat of some expensive stuff followed by the backref, as we will be able to avoid recursively dissecting the "expensive stuff" before discovering that the backref isn't satisfied with a particular midpoint that the lower concat node is testing. This doesn't help if the concat tree is left-deep, as the capture node won't get set soon enough (and it's hard to fix that without changing the engine's match behavior). Fortunately, right-deep concat trees are the common case. Patch by me, reviewed by Joel Jacobson Discussion: https://postgr.es/m/661609.1614560029@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/0c3405cf11a12da1a4278c6833f4d979fe06c866

  • Suppress unnecessary regex subre nodes in a couple more cases. This extends the changes made in commit cebc1d34e, teaching parseqatom() to generate fewer or cheaper subre nodes in some edge cases. The case of interest here is a quantified atom that is "messy" only because it has greediness opposite to what preceded it (whereas captures and backrefs are intrinsically messy). In this case we don't need an iteration node, since we don't care where the sub-matches of the quantifier are; and we might also not need a second concatenation node. This seems of only marginal real-world use according to my testing, but I wanted to get it in before wrapping up this series of regex performance fixes. Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/4604f83fdfe030a2f1984159ede5270c1d825310

  • Mark default_transaction_read_only as GUC_REPORT. This allows clients to find out the setting at connection time without having to expend a query round trip to do so; which is helpful when trying to identify read/write servers. (One must also look at in_hot_standby, but that's already GUC_REPORT, cf bf8a662c9.) Modifying libpq to make use of this will come soon, but I felt it cleaner to push the server change separately. Haribabu Kommi, Greg Nancarrow, Vignesh C; reviewed at various times by Laurenz Albe, Takayuki Tsunakawa, Peter Smith. Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d16f8c8e416d288bd4734ed5f14076b62ec8d153

  • Extend the abilities of libpq's target_session_attrs parameter. In addition to the existing options of "any" and "read-write", we now support "read-only", "primary", "standby", and "prefer-standby". "read-write" retains its previous meaning of "transactions are read-write by default", and "read-only" inverts that. The other three modes test specifically for hot-standby status, which is not quite the same thing. (Setting default_transaction_read_only on a primary server renders it read-only to this logic, but not a standby.) Furthermore, if talking to a v14 or later server, no extra network round trip is needed to detect the session's status; the GUC_REPORT variables delivered by the server are enough. When talking to an older server, a SHOW or SELECT query is issued to detect session read-only-ness or server hot-standby state, as needed. Haribabu Kommi, Greg Nancarrow, Vignesh C, Tom Lane; reviewed at various times by Laurenz Albe, Takayuki Tsunakawa, Peter Smith. Discussion: https://postgr.es/m/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ee28cacf619f4d9c23af5a80e1171a5adae97381

  • Silence perlcritic warning in commit ee28cacf6. Per buildfarm; this fix is from Michael Paquier (Vignesh C proposed nearly the same). Discussion: https://postgr.es/m/YD8IZ9OKfUf9X1eF@paquier.xyz https://git.postgresql.org/pg/commitdiff/d422a2a94b1e7a7bb25da9d5511fffff750c3f21

  • Make test_target_session_attrs more robust against connection failure. Feed the desired command to psql via "-c" not stdin, else Perl may complain that it can't push stdin to an already-failed psql process, as seen in intermittent buildfarm failures. Make some minor cosmetic improvements while at it. Before commit ee28cacf6 we had no tests here that expected failure to connect, so there seems no need for a back-patch. Discussion: https://postgr.es/m/CALDaNm2mo8YED=M2ZJKGf1U3F3mw6SaQuLXWCK8rZP6sECYcrA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/3769e11a31831fc2f3bd4c4a24b4f45c352fb8fb

  • Add trim_array() function. This has been in the SQL spec since 2008. It's a pretty thin wrapper around the array slice functionality, but the spec says we should have it, so here it is. Vik Fearing, reviewed by Dian Fay Discussion: https://postgr.es/m/fc92ce17-9655-8ff1-c62a-4dc4c8ccd815@postgresfriends.org https://git.postgresql.org/pg/commitdiff/0a687c8f103d217ff1ca8c34a644b380d89bb0ad

  • Remove deprecated containment operators for contrib types. Since PG 8.2, @ and ~ have been deprecated aliases for the containment operators @> and <@. It seems like enough time has passed to actually remove them, so do so. This completes the project begun in commit 2f70fdb06. Note that in the core types, the relation to the preferred operator names was reversed from what it is in these contrib modules. The confusion that induced was a large part of the reason for deprecation. Justin Pryzby Discussion: https://postgr.es/m/20201027032511.GF9241@telsasoft.com https://git.postgresql.org/pg/commitdiff/112d411fbeb56afd18c117e20b524a86afc9aba5

  • Doc: remove obsolete entries in table of BRIN strategy numbers. See commit 591d282e8. Noted by Justin Pryzby Discussion: https://postgr.es/m/20201027032511.GF9241@telsasoft.com https://git.postgresql.org/pg/commitdiff/0ce4cd04da558178b0186057b721c50a00b7a945

  • Add binary I/O capability for cube datatype. We can adjust the not-yet-released cube--1.4--1.5.sql upgrade rather than making a whole new version. KaiGai Kohei Discussion: https://postgr.es/m/CAOP8fzZO4y60QPTK=RGDXeVeVHV9tLHKOsh7voUOoUouVCPV8A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1265a9c8f871cfa5aea4d607b1ab72b45b96dbb7
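
A few of the user-visible changes above are easy to illustrate. The sketch below shows the back-reference semantics fix, the new trim_array() function, and (in a comment) one of the new target_session_attrs modes; behavior is as the commit messages describe:

    -- back-references: constraints such as ^ copied from the referenced
    -- subexpression no longer constrain the backref itself
    SELECT 'xx' ~ '(^.)\1';                  -- now true; formerly could never match

    -- trim_array(), per SQL:2008: remove the last n elements
    SELECT trim_array(ARRAY[1,2,3,4,5], 2);  -- {1,2,3}

    -- target_session_attrs: new libpq modes, usable in a connection URI, e.g.
    --   postgresql://host1,host2/mydb?target_session_attrs=prefer-standby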

Michaël Paquier pushed:

Álvaro Herrera pushed:

Peter Geoghegan pushed:

  • Fix nbtree page deletion error messages. Adjust some "can't happen" error messages that assumed that the page deletion target page must be a half-dead page. This assumption was wrong in the case of an internal target page. Simply refer to these pages as the target page instead. Internal pages are never marked half-dead. There is exactly one half-dead page for each subtree undergoing deletion. The half-dead page is also the target subtree's leaf-level page. This has been the case since commit efada2b8, which totally overhauled nbtree page deletion. https://git.postgresql.org/pg/commitdiff/3d8d5787a358156edaa7782f0c88e231af974a01

  • nbtree page deletion: Add leaftopparent assertion. Add documenting assertion. This makes it easier to follow how we maintain the top parent link in target subtree's half-dead/leaf level page. https://git.postgresql.org/pg/commitdiff/5b2f2af3d9d57626b9052a05337e32ad1399093d

Peter Eisentraut pushed:

Heikki Linnakangas pushed:

Fujii Masao pushed:

Andrew Dunstan pushed:

Magnus Hagander pushed:

Pending Patches

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Maxim Orlov sent in a patch to fix an infelicity between TLS and multiple hosts in libpq connect strings.

Thomas Munro sent in two more revisions of a patch to add a condition variable for walreceiver state, another for recovery pause/resume, and poll the postmaster less frequently in recovery.

Ajin Cherian and Amit Kapila traded patches to add an option to enable two_phase commits via pg_create_logical_replication_slot.

Michael Banck sent in a patch to add a --create-only option to pg_dump/pg_dumpall.

Michaël Paquier sent in another revision of a patch to clarify the documentation of signal handling in the archiver.

Etsuro Fujita sent in another revision of a patch to implement asynchronous append on postgres_fdw nodes.

Peter Eisentraut sent in another revision of a patch to deal with macOS SIP in builds.

Laurenz Albe and Amit Langote traded patches to allow setting parallel_workers on partitioned tables.
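
A sketch of what such a patch would allow (this reloption is currently rejected on partitioned tables; the table name is made up):

    ALTER TABLE measurement SET (parallel_workers = 4);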

Joel Jacobson sent in three revisions of a patch to implement regexp_positions(), which returns the starting and ending positions of each match.

Thomas Munro sent in two more revisions of a patch to provide a new optional GUC that can be used to periodically check, while running very long queries, whether the client connection has gone away.

Paul Förster sent in another revision of a patch to mention URIs and services in psql --help's output.

Peter Smith and Ajin Cherian traded patches to implement logical decoding of two-phase transactions.

Greg Nancarrow and Amit Kapila traded patches to enable parallel SELECT for "INSERT INTO ... SELECT ...", add a new boolean GUC, enable_parallel_dml, and a similar table option, parallel_dml_enabled.

Dian M Fay sent in two revisions of a patch to suppress explicit casts of text constants in postgres_fdw when the other side of a binary OpExpr is also text.

Dilip Kumar sent in three more revisions of a patch to add a compression method option for tables.

Julien Rouhaud sent in another revision of a patch to move query jumbling from pg_stat_statements into core, expose queryid in pg_stat_activity and log_line_prefix, and expose query identifier in EXPLAIN ... VERBOSE.

Bruce Momjian sent in a patch to fix up some GiST code comments.

Vik Fearing sent in another revision of a patch to implement GROUP BY DISTINCT, which elides duplicated GROUPING SETs.
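
A sketch of the proposed syntax (per the patch; DISTINCT drops the duplicate grouping sets produced by expanding the two ROLLUP clauses):

    SELECT a, b, c, count(*)
    FROM t
    GROUP BY DISTINCT ROLLUP (a, b), ROLLUP (a, c);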

Mark Dilger sent in another revision of a patch to add a contrib extension, pg_amcheck.

Pavel Borisov sent in another revision of a patch to make amcheck check UNIQUE constraints for btree indexes.

Jacob Champion sent in another revision of a patch to add an API to the table AM to accept a column projection list.

Joel Jacobson sent in two revisions of a patch to support empty ranges with bounds information.

Tomáš Vondra sent in three more revisions of a patch to implement BRIN multi-range indexes.

Peter Eisentraut sent in another revision of a patch to implement SQL-standard function body.

Thomas Munro sent in two more revisions of a patch to implement parallel full hash join.

Magnus Hagander sent in five revisions of a patch to add PROXY protocol support.

Vigneshwaran C sent in a patch to fix a Buildfarm failure in crake by separating a declaration from an assignment in Perl code.

Michaël Paquier sent in another revision of a patch to enable the libcrypto callbacks before checking whether SSL needs to be done.

Daniel Gustafsson sent in another revision of a patch to support NSS as a libpq TLS backend.

Gilles Darold sent in a patch to implement regexp_count, regexp_instr, regexp_substr and regexp_replace.
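
A sketch of the proposed functions, which are modeled on Oracle's; the signatures follow the patch thread and may change before commit:

    SELECT regexp_count('ababab', 'ab');       -- 3: number of matches
    SELECT regexp_instr('ababab', 'b', 1, 2);  -- 4: position of the 2nd match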

David Rowley sent in another revision of a patch to reduce the number of special cases to build contrib modules on Windows.

Amit Langote sent in another revision of a patch to make UPDATE and DELETE scale better on inheritance trees.

John Naylor sent in a patch to review some alternatives to the current popcount implementations.

Mark Rofail sent in another revision of a patch to implement foreign key arrays.

Pavel Stěhule sent in three more revisions of a patch to make psql able to use a pager for \watch.

Tom Lane sent in another revision of a patch to remove support for COPY FROM STDIN in protocol version 2.

Mark Dilger sent in two revisions of a patch to add a pg_amcheck contrib application.

Soumyadeep Chakraborty and Kyotaro HORIGUCHI traded patches to prevent the checkpointer from writing to an older timeline.

Amit Kapila sent in another revision of a patch to track replication origin progress for rollbacks.

Masahiro Ikeda and Fujii Masao traded patches to add WAL write/fsync statistics to pg_stat_wal.

Justin Pryzby and Takayuki Tsunakawa traded patches to speed up COPY FROM on tables with foreign partitions.

Álvaro Herrera and Justin Pryzby traded patches to add batch/pipelining support for libpq.

Daniel Gustafsson sent in three more revisions of a patch to disallow SSL compression.

Andrey V. Lepikhov and Tom Lane traded patches to remove the 64K rangetable limit.

Robins Tharakan sent in a patch to fix pg_upgrade for large numbers of large objects.

Jacob Champion sent in a patch to rework the sslfiles Makefile target.

Kyotaro HORIGUCHI sent in another revision of a patch to move the callback call from ReadPageInternal to XLogReadRecord, move the page reader out of XLogReadRecord, remove globals readOff, readLen and readSegNo, and allow xlogreader to use different xlog blocksizes.

Andy Fan sent in a patch to adjust the cost model for the partition pruning case.

Kyotaro HORIGUCHI sent in another revision of a patch to make the End-Of-Recovery error less scary.

Heikki Linnakangas sent in another revision of a patch to force lookahead in COPY FROM parsing.

Daniel Gustafsson sent in another revision of a patch to remove deprecated parameters authtype and pqtty from libpq.

Amul Sul sent in another revision of a patch to implement wal prohibit state using a global barrier, and error or Assert before START_CRIT_SECTION for WAL writes.

Thomas Munro sent in a patch to make relfile tombstone files conditional on WAL level.

Amit Langote sent in another revision of a patch to make some cosmetic improvements to the partition pruning step generation code.

Thomas Munro sent in another revision of a patch to replace buffer I/O locks with condition variables.

Thomas Munro sent in two more revisions of a patch to add missing pthread_barrier_t, refactor the way thread portability is done in pgbench, improve pgbench's time measurement code, and synchronize its client threads.

Julien Rouhaud sent in another revision of a patch to fix various shared memory estimates.

Laurenz Albe sent in two more revisions of a patch to improve \e, \ef and \ev when the editor is exited without saving, by retaining the current query buffer rather than executing the previous query.

Masahiko Sawada sent in two revisions of a patch to ensure that pg_stat_xxx_tables.n_mod_since_analyze is reset on TRUNCATE.

Tomáš Vondra sent in two more revisions of a patch to make it possible to collect extended statistics on expressions.

Hayato Kuroda sent in another revision of a patch to refactor ECPGconnect and allow IPv6 connections in it.

Kyotaro HORIGUCHI sent in another revision of a patch to delay checkpoint completion after truncate success.

Kota Miyake sent in a patch to remove some warts from pgbench's \sleep metacommand.

Hou Zhijie sent in a patch to avoid CCI in RI triggers when inserting into an FK relation.

Ibrar Ahmed sent in another revision of a patch to fix tests broken by an earlier patch to make GROUP BY work more efficiently.

Kirk Jamison sent in another revision of a patch to implement tracing in libpq.

Kyotaro HORIGUCHI sent in another revision of a patch to change the stats collector from using files for temporary storage to shared memory.

Andres Freund sent in another revision of a patch to ensure that on Windows, PostgreSQL is only considered to be running as a service if stderr is invalid.

Thomas Munro sent in another revision of a patch to inject fault timing, and fix a race condition in parallel hash join batch cleanup.

Andrew Dunstan sent in a patch to allow matching the whole DN from a client certificate.

KaiGai Kohei and Tom Lane traded patches to add binary input/output handlers to contrib/cube.

Bharath Rupireddy sent in two more revisions of a patch to refactor the code for refreshing materialized views into a group of functions, and implement EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW.

Justin Pryzby sent in a patch to bootstrap that makes it possible to convert a Typ to a List*, and use that to allow composite types in bootstrap.

Kyotaro HORIGUCHI sent in a patch to fix an intermittent failure in 011_crash_recovery.pl by adding a CHECKPOINT.

Joel Jacobson sent in a patch to create a pg_permissions view.

Joel Jacobson sent in a patch to create a pg_ownerships view.

Justin Pryzby sent in another revision of a patch to make it possible to run CLUSTER on partitioned indexes.

Bharath Rupireddy sent in another revision of a patch to add another flavor of pg_terminate_backend that allows people to specify wait and timeout.
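
A sketch of the proposed call (the signature follows the patch thread and may differ in its final form):

    -- terminate PID 12345 and wait up to 5000 ms for it to actually exit
    SELECT pg_terminate_backend(12345, 5000);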

Bharath Rupireddy sent in another revision of a patch to improve the "PID XXXX is not a PostgreSQL server process" message by splitting out the case where sending a signal is not allowed.

Li Japin sent in another revision of a patch to implement ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION.

Joe Conway sent in a patch to fix a bug in has_column_privilege() that got attnums and non-existent columns wrong.

Justin Pryzby sent in another revision of a patch to implement ALTER TABLE SET ACCESS METHOD, allow specifying the table access method of partitioned tables to be inherited by partitions, and implement an lsyscache get_rel_relam() because a table's AM is now in play.


Parquet S3 FDW 0.1 released

March 3, 2021 - 01:00

We have just released version 0.1 of the Foreign Data Wrapper for Parquet files on Amazon S3. This release can work with PostgreSQL 13.

The code is based on parquet_fdw created by adjust GmbH. This FDW supports the following features, the same as the original parquet_fdw:

  • Support SELECT from Parquet files on a local file system.
  • Support creating a foreign table for multiple files by specifying file paths.

This FDW supports the following features in addition to the original parquet_fdw:

  • Support SELECT from Parquet files on Amazon S3.
  • Support MinIO access instead of Amazon S3.
  • Support creating a foreign table for multiple files in a directory by specifying a directory path.

It was developed by the Toshiba Software Engineering & Technology Center. Please see the repository for details.

Source repository : https://github.com/pgspider/parquet_s3_fdw
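
A hedged setup sketch; the option names follow the project's README for 0.1 and may differ in later releases:

    CREATE EXTENSION parquet_s3_fdw;

    CREATE SERVER parquet_s3 FOREIGN DATA WRAPPER parquet_s3_fdw;

    -- S3 credentials go in the user mapping
    CREATE USER MAPPING FOR CURRENT_USER SERVER parquet_s3
        OPTIONS (user 'ACCESS_KEY_ID', password 'SECRET_ACCESS_KEY');

    CREATE FOREIGN TABLE userdata (id int, name text)
        SERVER parquet_s3
        OPTIONS (filename 's3://mybucket/dir/userdata.parquet');

    SELECT count(*) FROM userdata;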


PostgreSQL Weekly News - February 28, 2021

March 1, 2021 - 01:00
PostgreSQL Weekly News - February 28, 2021

PostgreSQL Product News

Database Lab 2.2.1, a tool for fast cloning of large PostgreSQL databases to build non-production environments, released: https://gitlab.com/postgres-ai/database-lab/-/releases

dbMigration .NET v13.4, a database migration and sync tool, released. https://fishcodelib.com/DBMigration.htm

Joe 0.9.0, a Slack chatbot that helps backend developers and DBAs troubleshoot and optimize PostgreSQL queries, released. https://gitlab.com/postgres-ai/joe/-/releases#0.9.0

pgAdmin4 5.0, a web- and native GUI control center for PostgreSQL, released. https://www.pgadmin.org/docs/pgadmin4/5.0/release_notes_5_0.html

pgagroal 1.2.0, a high-performance protocol-native connection pool for PostgreSQL, released. https://agroal.github.io/pgagroal/release/announcement/2021/02/23/pgagroal-1.2.0.html

Person of the week: https://postgresql.life/post/takayuki_tsunakawa/

PostgreSQL Jobs for February

https://archives.postgresql.org/pgsql-jobs/2021-02/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Tom Lane pushed:

  • Fix invalid array access in trgm_regexp.c. Brown-paper-bag bug in 08c0d6ad6: I missed one place that needed to guard against RAINBOW arc colors. Remarkably, nothing noticed the invalid array access except buildfarm member thorntail. Thanks to Noah Misch for assistance with tracking this down. https://git.postgresql.org/pg/commitdiff/6ee479abfc27a18c37fe77140d16d3ac31f4ac31

  • Simplify memory management for regex DFAs a little. Coverity complained that functions in regexec.c might leak DFA storage. It's wrong, but this logic is confusing enough that it's not so surprising Coverity couldn't make sense of it. Rewrite in hopes of making it more legible to humans as well as machines. https://git.postgresql.org/pg/commitdiff/190c79884aae540c92f017701726ed69265e2dab

  • Suppress compiler warning in new regex match-all detection code. gcc 10 is smart enough to notice that control could reach this "hasmatch[depth]" assignment with depth < 0, but not smart enough to know that that would require a badly broken NFA graph. Change the assert() to a plain runtime test to shut it up. Per report from Andres Freund. Discussion: https://postgr.es/m/20210223173437.b3ywijygsy6q42gq@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/3db05e76f92846d4b54d7de251b0875cf1e23aa4

  • Allow complemented character class escapes within regex brackets. The complement-class escapes \D, \S, \W are now allowed within bracket expressions. There is no semantic difficulty with doing that, but the rather hokey macro-expansion-based implementation previously used here couldn't cope. Also, invent "word" as an allowed character class name, thus "\w" is now equivalent to "[[:word:]]" outside brackets, or "[:word:]" within brackets. POSIX allows such implementation-specific extensions, and the same name is used in e.g. bash. One surprising compatibility issue this raises is that constructs such as "[\w-_]" are now disallowed, as our documentation has always said they should be: character classes can't be endpoints of a range. Previously, because \w was just a macro for "[:alnum:]_", such a construct was read as "[[:alnum:]_-_]", so it was accepted so long as the character after "-" was numerically greater than or equal to "_". Some implementation cleanup along the way: * Remove the lexnest() hack, and in consequence clean up wordchrs() to not interact with the lexer. * Fix colorcomplement() to not be O(N^2) in the number of colors involved. * Get rid of useless-as-far-as-I-can-see calls of element() on single-character character element names in brackpart(). element() always maps these to the character itself, and things would be quite broken if it didn't --- should "[a]" match something different than "a" does? Besides, the shortcut path in brackpart() wasn't doing this anyway, making it even more inconsistent. Discussion: https://postgr.es/m/2845172.1613674385@sss.pgh.pa.us Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/2a0af7fe460eb46f9af996075972bf7c2e3f211d

  • Change regex \D and \W shorthands to always match newlines. Newline is certainly not a digit, nor a word character, so it is sensible that it should match these complemented character classes. Previously, \D and \W acted that way by default, but in newline-sensitive mode ('n' or 'p' flag) they did not match newlines. This behavior was previously forced because explicit complemented character classes don't match newlines in newline-sensitive mode; but as of the previous commit that implementation constraint no longer exists. It seems useful to change this because the primary real-world use for newline-sensitive mode seems to be to match the default behavior of other regex engines such as Perl and Javascript ... and their default behavior is that these match newlines. The old behavior can be kept by writing an explicit complemented character class, i.e. [^[:digit:]] or [^[:word:]]. (This means that \D and \W are not exactly equivalent to those strings, but they weren't anyway.) Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/7dc13a0f0805a353cea0455ed95701322b39d4dd

  • Doc: remove src/backend/regex/re_syntax.n. We aren't publishing this file as documentation, and it's been much more haphazardly maintained than the real docs in func.sgml, so let's just drop it. I think the only reason I included it in commit 7bcc6d98f was that the Berkeley-era sources had had a man page in this directory. Discussion: https://postgr.es/m/4099447.1614186542@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/301ed8812e3f8b102b85e1f5a59e95990ed9a868

  • Fix list-manipulation bug in WITH RECURSIVE processing. makeDependencyGraphWalker and checkWellFormedRecursionWalker thought they could hold onto a pointer to a list's first cons cell while the list was modified by recursive calls. That was okay when the cons cell was actually separately palloc'd ... but since commit 1cff1b95a, it's quite unsafe, leading to core dumps or incorrect complaints of faulty WITH nesting. In the field this'd require at least a seven-deep WITH nest to cause an issue, but enabling DEBUG_LIST_MEMORY_USAGE allows the bug to be seen with lesser nesting depths. Per bug #16801 from Alexander Lakhin. Back-patch to v13. Michael Paquier and Tom Lane Discussion: https://postgr.es/m/16801-393c7922143eaa4d@postgresql.org https://git.postgresql.org/pg/commitdiff/80ca8464fe02296c8efefd53746e6d6a3f456d1e

  • Improve memory management in regex compiler. The previous logic here created a separate pool of arcs for each state, so that the out-arcs of each state were physically stored within it. Perhaps this choice was driven by trying to not include a "from" pointer within each arc; but Spencer gave up on that idea long ago, and it's hard to see what the value is now. The approach turns out to be fairly disastrous in terms of memory consumption, though. In the first place, NFAs built by this engine seem to have about 4 arcs per state on average, with a majority having only one or two out-arcs. So pre-allocating 10 out-arcs for each state is already cause for a factor of two or more bloat. Worse, the NFA optimization phase moves arcs around with abandon. In a large NFA, some of the states will have hundreds of out-arcs, so towards the end of the optimization phase we have a significant number of states whose arc pools have room for hundreds of arcs each, even though only a few of those arcs are in use. We have seen real-world regexes in which this effect bloats the memory requirement by 25X or even more. Hence, get rid of the per-state arc pools in favor of a single arc pool for the whole NFA, with variable-sized allocation batches instead of always asking for 10 at a time. While we're at it, let's batch the allocations of state structs too, to further reduce the malloc traffic. This incidentally allows moveouts() to be optimized in a similar way to moveins(): when moving an arc to another state, it's now valid to just re-link the same arc struct into a different outchain, where before the code invariants required us to make a physically new arc and then free the old one. These changes reduce the regex compiler's typical space consumption for average-size regexes by about a factor of two, and much more for large or complicated regexes. In a large test set of real-world regexes, we formerly had half a dozen cases that failed with "regular expression too complex" due to exceeding the REG_MAX_COMPILE_SPACE limit (about 150MB); we would have had to raise that limit to something close to 400MB to make them work with the old code. Now, none of those cases need more than 13MB to compile. Furthermore, the test set is about 10% faster overall due to less malloc traffic. Discussion: https://postgr.es/m/168861.1614298592@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/0fc1af174cf7113445e116feb2813405b838a47d

  • Doc: further clarify libpq's description of connection string URIs. Break the synopsis into named parts to make it less confusing. Make more than zero effort at applying SGML markup. Do a bit of copy-editing of nearby text. The synopsis revision is by Alvaro Herrera and Paul Förster, the rest is my fault. Back-patch to v10 where multi-host connection strings appeared. Discussion: https://postgr.es/m/6E752D6B-487C-463E-B6E2-C32E7FB007EA@gmail.com https://git.postgresql.org/pg/commitdiff/4e90052c46c7751779ed83627676ed5e74ebe6d4
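
The two regex behavior changes above (complemented class escapes inside brackets, and \D/\W matching newlines) can be checked directly. A minimal sketch:

    -- complement-class escapes are now allowed within bracket expressions
    SELECT 'x!' ~ 'x[\S]';                      -- true: '!' is not whitespace

    -- in newline-sensitive mode, \D now matches a newline too
    SELECT regexp_match(E'1\n2', '1\D2', 'n');  -- now matches; formerly NULL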

Thomas Munro pushed:

Michaël Paquier pushed:

Peter Eisentraut pushed:

Fujii Masao pushed:

Magnus Hagander pushed:

  • Fix docs build for website styles. Building the docs with STYLE=website referenced a stylesheet that no longer exists on the website, since we changed it to use versioned references. To make it less likely for this to happen again, point to a single stylesheet on the website which will in turn import the required one. That puts the process entirely within the scope of the website repository, so next time a version is switched that's the only place changes have to be made, making them less likely to be missed. Per (off-list) discussion with Peter Geoghegan and Jonathan Katz. https://git.postgresql.org/pg/commitdiff/d22d0fa937616a3112f69ebd6a6ed4f039162441

Álvaro Herrera pushed:

Amit Kapila pushed:

Peter Geoghegan pushed:

  • Use full 64-bit XIDs in deleted nbtree pages. Otherwise we risk "leaking" deleted pages by making them non-recyclable indefinitely. Commit 6655a729 did the same thing for deleted pages in GiST indexes. That work was used as a starting point here. Stop storing an XID indicating the oldest bpto.xact across all deleted though unrecycled pages in nbtree metapages. There is no longer any reason to care about that condition/the oldest XID. It only ever made sense when wraparound was something _bt_vacuum_needs_cleanup() had to consider. The btm_oldest_btpo_xact metapage field has been repurposed and renamed. It is now btm_last_cleanup_num_delpages, which is used to remember how many non-recycled deleted pages remain from the last VACUUM (in practice its value is usually the precise number of pages that were _newly deleted_ during the specific VACUUM operation that last set the field). The general idea behind storing btm_last_cleanup_num_delpages is to use it to give _some_ consideration to non-recycled deleted pages inside _bt_vacuum_needs_cleanup() -- though never too much. We only really need to avoid leaving a truly excessive number of deleted pages in an unrecycled state forever. We only do this to cover certain narrow cases where no other factor makes VACUUM do a full scan, and yet the index continues to grow (and so actually misses out on recycling existing deleted pages). These metapage changes result in a clear user-visible benefit: We no longer trigger full index scans during VACUUM operations solely due to the presence of only 1 or 2 known deleted (though unrecycled) blocks from a very large index. All that matters now is keeping the costs and benefits in balance over time. Fix an issue that has been around since commit 857f9c36, which added the "skip full scan of index" mechanism (i.e. the _bt_vacuum_needs_cleanup() logic). The accuracy of btm_last_cleanup_num_heap_tuples accidentally hinged upon when the source value gets stored. We now always store btm_last_cleanup_num_heap_tuples in btvacuumcleanup(). This fixes the issue because IndexVacuumInfo.num_heap_tuples (the source field) is expected to accurately indicate the state of the table _after_ the VACUUM completes inside btvacuumcleanup(). A backpatchable fix cannot easily be extracted from this commit. A targeted fix for the issue will follow in a later commit, though that won't happen today. I (pgeoghegan) have chosen to remove any mention of deleted pages in the documentation of the vacuum_cleanup_index_scale_factor GUC/param, since the presence of deleted (though unrecycled) pages is no longer of much concern to users. The vacuum_cleanup_index_scale_factor description in the docs now seems rather unclear in any case, and it should probably be rewritten in the near future. Perhaps some passing mention of page deletion will be added back at the same time. Bump XLOG_PAGE_MAGIC due to nbtree WAL records using full XIDs now. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX=d5u-tRYhi-F4wnV2uN2zHpMUXw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/e5d8a999030418a1b9e53d5f15ccaca7ed674877

  • VACUUM VERBOSE: Count "newly deleted" index pages. Teach VACUUM VERBOSE to report on pages deleted by the current VACUUM operation -- these are newly deleted pages. VACUUM VERBOSE continues to report on the total number of deleted pages in the entire index (no change there). The former is a subset of the latter. The distinction between each category of deleted index page only arises with index AMs where page deletion is supported and is decoupled from page recycling for performance reasons. This is follow-up work to commit e5d8a999, which made nbtree store 64-bit XIDs (not 32-bit XIDs) in pages at the point at which they're deleted. Note that the btm_last_cleanup_num_delpages metapage field added by that commit usually gets set to pages_newly_deleted. The exceptions (the scenarios in which they're not equal) all seem to be tricky cases for the implementation (of page deletion and recycling) in general. Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX%3Dd5u-tRYhi-F4wnV2uN2zHpMUXw%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/2376361839091b0dcdcc0b77f938b809b5f21646

David Rowley pushed:

  • Add TID Range Scans to support efficient scanning ranges of TIDs. This adds a new executor node named TID Range Scan. The query planner will generate paths for TID Range scans when quals are discovered on base relations which search for ranges on the table's ctid column. These ranges may be open at either end. For example, WHERE ctid >= '(10,0)'; will return all tuples on page 10 and over. To support this, two new optional callback functions have been added to table AM. scan_set_tidrange is used to set the scan range to just the given range of TIDs. scan_getnextslot_tidrange fetches the next tuple in the given range. For AMs were scanning ranges of TIDs would not make sense, these functions can be set to NULL in the TableAmRoutine. The query planner won't generate TID Range Scan Paths in that case. Author: Edmund Horner, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Tom Lane, Andres Freund, Zhihong Yu Discussion: https://postgr.es/m/CAMyN-kB-nFTkF=VA_JPwFNo08S0d-Yk0F741S2B7LDmYAi8eyA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/bb437f995d47405ecd92cf66df71f7f7e40ed460

  • Add missing TidRangeScan readfunc. Mistakenly forgotten in bb437f995 https://git.postgresql.org/pg/commitdiff/977b2c08535f2a82ba7c310c88420cbbca1772e8

Noah Misch pushed:

Pending Patches

Justin Pryzby sent in another revision of a patch to make INSERT SELECT use BulkInsertState and multi_insert, check for volatile defaults to ensure that any dependencies on them not be lost, make COPY flush the multi-insert buffer based on accumulated size of tuples, rather than line length, and check tuple size for a more accurate measure of chunk size when computing when to flush the buffer.

Hou Zhijie sent in another revision of a patch to add one GUC and one per-table option, both named enable_parallel_dml, to control whether DMLs include an option to execute in parallel.

Bharath Rupireddy sent in another revision of a patch to add GUCs both at the FDW level and at the foreign server level called keep_connections.

Masahiko Sawada sent in a patch to add a check whether or not to do index vacuum (and heap vacuum) based on whether or not 1% of all heap pages have an LP_DEAD line pointer.

Shenhao Wang sent in a patch to make --enable-coverage succeed without finding lcov, as the actual coverage tests can run without it.

Jim Mlodgenski sent in a patch to add a parser hook.

Mats Kindahl sent in a patch to a callback to TableAccessMethod that is called when the table should be scheduled for unlinking, and implements the method for the heap access method.

Justin Pryzby sent in three more revisions of a patch to report text parameters during errors in typinput, and exercise parameter output on error with binary parameters.

Daniel Gustafsson sent in two more revisions of a patch to make it possible to use NSS for libpq's TLS backend.

Jan Wieck sent in another revision of a patch to make the wire protocol pluggable and use same to answer via telnet.

Justin Pryzby sent in another revision of a patch to touch up the documentation for the upcoming release.

Iwata Aya and Álvaro Herrera traded patches to improve libpq tracing capabilities.

Amit Kapila sent in a patch to update the docs and comments for decoding of prepared xacts to match the current behavior.

Daniel Gustafsson sent in another revision of a patch to check the version of target cluster binaries in pg_upgrade.

Mark Rofail sent in another revision of a patch to implement foreign key arrays.

Matthias van de Meent sent in another revision of a patch to add progress-reported components for COPY progress reporting including a new view, pg_stat_progress_copy, add backlinks to progress reporting documentation, and add regression tests for same.

Dilip Kumar sent in three more revisions of a patch to provide a new interface to get the recovery pause status, pg_get_wal_replay_pause_state, that returns the actual status of the recovery pause i.e.'not paused' if pause is not requested, 'pause requested' if pause is requested but recovery is not yet paused and 'paused' if recovery is actually paused.

KaiGai Kohei sent in a patch to add binary input/output handlers to contrib/cube.

Georgios Kokolatos sent in another revision of a patch to make dbsize more consistent.

Mark Dilger sent in another revision of a patch to add pg_amcheck, a command line interface for running amcheck's verifications against tables and indexes.

John Naylor sent in two more revisions of a patch to make it possible to verify utf-8 using SIMD instructions.

Hayato Kuroda sent in three revisions of a patch to refactor ECPGconnect and allow IPv6 connections there.

Amit Langote, Greg Nancarrow, and Amit Kapila traded patches to make it possible to execute INSERT (INTO ... SELECT ...) with multiple workers.

Julien Rouhaud sent in another revision of a patch to add a new COLLATION option to REINDEX.

John Naylor sent in two revisions of a patch to allow inserting tuples into almost-empty pages.

Paul Martinez sent in two more revisions of a patch to document the effect of max_replication_slots on the subscriber side.

Ajin Cherian and Amit Kapila traded patches to avoid repeated decoding of prepared transactions after the restart, and add an option to enable two-phase commits in pg_create_logical_replication_slot.

Peter Eisentraut sent in another revision of a patch to fix use of cursor sensitivity terminology to match that in the SQL standard, removes the claim that sensitive cursors are supported, and adds a new option, ASENSITIVE, to cursors, that being the default behavior.

Benoit Lobréau sent in a patch to document in more detail how archive_command fails based on the signal it was sent, and whether it's reported in pg_stat_archiver.

Peter Eisentraut sent in another revision of a patch to set SNI for SSL connections from the client, which allows an SNI-aware proxy to route connections.

Peter Smith sent in three more revisions of a patch to implement logical decoding of two-phase transactions.

Amit Kapila sent in another revision of a patch to update documentation of logical replication to include the recently added logical replication configuration settings, and mention the fact that table synchronization workers are now using replication origins to track progress.

Thomas Munro sent in another revision of a patch to replace buffer I/O locks with condition variables.

Amit Langote sent in another revision of a patch to fix a misbehavior of partition row movement by ensuring that foreign key triggers are created on partitioned tables, and use same to enforce foreign keys correctly during cross-partition updates.

Thomas Munro sent in another revision of a patch to prevent latches from sending signals to processes that aren't currently sleeping, use SIGURG rather than SIGUSR1 for latches, use signalfd for epoll latches, which cuts down on system calls and other overheads by waiting on a signalfd instead of a signal handler and self-pipe, and use EVFILT_SIGNAL for kqueue latches.

Michaël Paquier sent in a patch to add a --tablespace option to reindexdb, matching the recently added capability for REINDEX.

Kota Miyake sent in a patch to fix pgbench's reporting of database name in errors when both PGUSER and PGPORT are set.

Amul Sul sent in another revision of a patch to implement wal prohibit state using a global barrier, error or Assert before START_CRIT_SECTION for WAL write, and document same.

Justin Pryzby sent in another revision of a patch to make it possible to use CREATE INDEX CONCURRENTLY on a partitioned table.

Jacob Champion sent in another revision of a patch to save the user's original authenticated identity for logging.

Daniel Gustafsson sent in another revision of a patch to disallow SSL compression by ignoring the option that would have turned it on. A later patch will remove the option entirely, now that it's deprecated.

Daniel Gustafsson sent in a patch to remove the defaults from libpq's authtype parameter, as it has been deprecated.

Álvaro Herrera sent in another revision of a patch to implement ALTER TABLE .. DETACH PARTITION CONCURRENTLY.

Dilip Kumar sent in two more revisions of a patch to make it possible to set the compression type for a table.

Euler Taveira de Oliveira sent in another revision of a patch to implement row filtering for logical replication using an optional WHERE clause in the DDL for PUBLICATIONs.

Thomas Munro sent in another revision of a patch to introduce symbolic names for FeBeWaitSet positions, and use FeBeWaitSet for walsender.c.

Thomas Munro sent in another revision of a patch to use condition variables for ProcSignalBarriers, allow condition variables to be used in interrupt code, and use a global barrier to fix DROP TABLESPACE on Windows by making it by force all backends to close all fds on that platform.

Andrey Borodin sent in a patch to use different compression methods for FPI.

Julien Rouhaud sent in a patch to change the explicit alignment use in pg_prewarm and pg_stat_statements to CACHELINEALIGN, and updates the alignment in hash_estimate_size() to an estimate of what ShmemInitHash will actually consume based on CACHELINEALIGN.

Thomas Munro sent in a patch to remove latch.c workaround for Linux < 2.6.27.

Peter Eisentraut sent in another revision of a patch to psql which makes it show all query results by default.

Jeff Janes sent in a patch to make SCRAM's behavior match MD5's by reporting in a DETAIL message when the password does not match for a user.

Joel Jacobson sent in a patch to implement a regexp_positions() function.

Paul Förster sent in a patch to mention database URIs in psql's --help output.

Justin Pryzby sent in a patch to refactor ATExec{En,Dis}ableRowSecurity in the style of ATExecForceNoForceRowSecurity, and do some further refactoring.

Justin Pryzby sent in a patch to implement ALTER TABLE SET TABLE ACCESS METHOD.

Kategóriák: Informatika

pgagroal 1.2.0

2021, február 25 - 01:00

The pgagroal community is happy to announce version 1.2.0.

New features

  • Allow users connecting to pgagroal to have different passwords than passwords used for the PostgreSQL connections

Various enhancements and bug fixes.

pgagroal

pgagroal is a high-performance protocol-native connection pool for PostgreSQL.

Features

  • High performance
  • Connection pool
  • Limit connections for users and databases
  • Prefill support
  • Remove idle connections
  • Perform connection validation
  • Enable / disable database access
  • Graceful / fast shutdown
  • Prometheus support
  • Remote management
  • Authentication query support
  • Failover support
  • Transport Layer Security (TLS) v1.2+ support
  • Daemon mode
  • User vault

Learn more on our web site or GitHub. Follow on Twitter.

pgagroal is released under the 3-clause BSD license, and is sponsored by Red Hat.

Kategóriák: Informatika

Database Lab Engine 2.2.0 and Joe Bot 0.9.0

2021, február 25 - 01:00
About Database Lab Engine

The Database Lab Engine (DLE) is an open-source experimentation platform for PostgreSQL databases. The DLE instantly creates full-size thin clones of your production database which you can use to:

  1. Test database migrations
  2. Optimize SQL queries
  3. Deploy full-size staging applications

The Database Lab Engine can generate thin clones for any size database, eliminating the hours (or days!) required to create “thick” database copies using conventional methods. Thin clones are independent, fully writable, and will behave identically to production: they will have the same data and will generate the same query plans.

Learn more about the Database Lab Engine and sign up for an account at https://postgres.ai/.

Database Lab Engine 2.2.0

Database Lab Engine (DLE) 2.2.0 further improves support for both types of PostgreSQL data directory initialization and synchronization: “physical” and “logical”. Particularly, for the “logical” type (which is useful for managed cloud PostgreSQL such as Amazon RDS users), it is now possible to setup multiple disks or disk arrays and automate data retrieval on a schedule. This gracefully cleans up the oldest versions of data, without downtime or interruptions in the lifecycle of clones.

Other improvements include:

  • Auto completion for the client CLI (“dblab”)
  • Clone container configuration — Docker parameters now can be defined in DLE config (such as --shm--size that is needed to avoid errors in newer versions of Postgres when parallel workers are used to process queries)
  • Allow requesting a clone with non-superuser access — This appears as a new option in the API and CLI called “restricted”

Database Lab Engine links:

Joe Bot 0.9.0 - A Virtual DBA for SQL Optimization

“Joe Bot”, a virtual DBA for SQL optimization, is a revolutionary new way to troubleshoot and optimize PostgreSQL query performance. Instead of running EXPLAIN or EXPLAIN (ANALYZE, BUFFERS) directly in production, users send queries for troubleshooting to Joe Bot. Joe Bot uses the Database Lab Engine (DLE) to:

  • Generate a fresh thin clone
  • Execute the query on the clone
  • Return the resulting execution plan to the user

The returned plan is identical to production in terms of structure and data volumes – this is achieved thanks to two factors:

  • thin clones have the same data and statistics as production (at a specified point in time), and
  • the PostgreSQL planner configuration on clones matches the production configuration.

Joe Bot users not only get reliable and risk-free information on how a query will be executed on production but also they can easily apply any changes to their own thin clones and see how query behavior is affected. For example, it is possible to add a new index and see if it actually helps to speed up the query.

One key aspect of Joe Bot, is the fact that users do not see the data directly, they only work with metadata. Therefore, teams without access to production data can be granted permissions to use this tool [1]

The main change in Joe Bot 0.9.0 is improved security: in past versions, DB superuser was used. Now a non-superuser is used for all requests. This makes it impossible to use plpythonu, COPY TO PROGRAM, FDW, or dblink to perform a massive copy of data outside infrastructructure which is not well protected by a strict firewall. All users are strongly recommended to upgrade as soon as possible.

Another major new feature is the production duration estimator, currently in an “experimental” state. This feature is intended to help users understand how long a specific operation - for example, an index creation operation - will actually take on the production database, which is likely to have a different physical infrastructure (for example a different filesystem, more RAM, and/or more CPU cores) than the thin clone running on the DLE. Read more: “Query duration difference between Database Lab and production environments”.

SQL Optimization Chatbot “Joe Bot” links:

[1] Although only metadata is returned from Joe Bot, it is possible to probe data for specific values using EXPLAIN ANALYZE. Please consult security experts in your organization before providing Joe Bot to people without production-level access.

Both Joe Bot and Database Lab Engine are distributed based on OSI-approved license (AGPLv3).

Your feedback is highly appreciated:

Kategóriák: Informatika

pgAdmin 4 v5.0 Released

2021, február 25 - 01:00

The pgAdmin Development Team are pleased to announce pgAdmin 4 version 5.0. This release of pgAdmin 4 includes 31 bug fixes and new features. For more details please see the release notes.

pgAdmin is the leading Open Source graphical management tool for PostgreSQL. For more information, please see the website.

Notable changes in this release include:

Features:
  • New Desktop Runtime (Using NWjs):

    The Desktop Runtime is now based on NWjs which integrates a browser and the Python server creating a standalone application. By implementing it using NWjs we get rid of the separate server application and the independent browser. We also get rid of QT and C++ runtime logic.

    There are two minor known issues with this feature (6255 and 6258), both of which are due to bugs in NWjs itself; Users on macOS should use the application menu to exit pgAdmin, rather than quitting from the Dock icon to avoid the first issue. The second issue may cause Windows users to see a red square instead of the normal application icon in some circumstances.

  • Logical Replication support:

    Logical replication uses a publish and subscribe model with one or more subscribers subscribing to one or more publications on a publisher node. We have added support for logical replication by introducing new treeview nodes and dialogues with which users can easily create/alter/delete publications and subscriptions. Support is also included in the Schema Diff tool.

  • Quick Search functionality:

    Added a quick search option in the Help menu to search menu items and help articles. Type at least three characters to display all the matching possibilities under Menu items and the relevant documents under Help articles.

  • Make Statistics, Dependencies, Dependants tabs closable. Users can add them back using the 'Add panel' option on the context menu for the tab strip.

  • When running in Docker/Kubernetes, ensure logs are not stored in the container, and only sent to the console.
Bugs/Housekeeping:
  • Use cheroot as the default production server for pgAdmin4
  • Updated Javascript dependencies to the latest versions
  • Fixed an issue where the focus is not properly set on the filter text editor after closing the error dialog.
  • Fixed an issue where the dependencies tab shows multiple owners for the objects having shared dependencies.
  • Fixed an issue where the Zoom to fit button in the ERD Tool only works if the diagram is larger than the canvas.
  • Fixed an issue where the user was unable to change the background color for a server.
  • Fixed an issue where external utility jobs (backup, maintenance etc.) are failing when the log level is set to DEBUG.
  • Ensure DEB/RPM packages depend on the same version of each other.
  • Fixed an autocomplete issue where it is not showing any suggestions if the schema name contains escape characters.

Builds for Windows and macOS are available now, along with a Python Wheel, Docker Container, RPM, DEB Package, and source code tarball from the tarball area.

Kategóriák: Informatika

Oldalak

Theme by me