 
| Type: | BUG |
| Impact: | NA | 
| Release Date: | 2015-07-19 | 
[4.3.0-55.el6.47.3]
- x86: vcpu_destroy_pagetables() must not return -EINTR
  .. otherwise it has the side effect that domain_relinquish_resources
  will stop and return to user-space with -EINTR, an error code which it is
  not equipped to deal with; or that vcpu_reset will
  ignore it and convert the error to -ENOMEM.
  The preemption mechanism we have for domain destruction is to return
  -EAGAIN (and then user-space calls the hypercall again) and as such we need
  to catch the case of:
  domain_relinquish_resources
  ->vcpu_destroy_pagetables
  -> put_page_and_type_preemptible
  -> __put_page_type
  returns -EINTR
  and convert it to the proper type. For:
  XEN_DOMCTL_setvcpucontext
  -> vcpu_reset
  -> vcpu_destroy_pagetables
  we need to return -ERESTART otherwise we end up returning -ENOMEM.
  The other callers of vcpu_destroy_pagetables, via arch_vcpu_reset
  (vcpu_reset), are:
  - hvm_s3_suspend (asserts on any return code),
  - vlapic_init_sipi_one (asserts on any return code).
  Signed-off-by: Konrad Rzeszutek Wilk 
  Signed-off-by: Jan Beulich 
  Acked-by: Chuck Anderson 
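  The error-code conversion described in the entry above can be modelled as
  follows. This is a minimal sketch in Python, not the hypervisor's C code:
  the function name convert_preempt_error and the string caller labels are
  hypothetical, introduced only for illustration.

```python
import errno

# ERESTART is missing from the errno module on some platforms; fall back to
# the Linux value (85). This fallback is an assumption of the sketch.
ERESTART = getattr(errno, "ERESTART", 85)


def convert_preempt_error(rc, caller):
    """Map -EINTR from a preempted page-table teardown to the caller's code.

    `caller` is a hypothetical label naming the path that reached
    vcpu_destroy_pagetables.
    """
    if rc != -errno.EINTR:
        return rc  # success and unrelated errors pass through unchanged
    if caller == "domain_relinquish_resources":
        # Domain destruction: user-space retries the hypercall on -EAGAIN.
        return -errno.EAGAIN
    if caller == "vcpu_reset":
        # XEN_DOMCTL_setvcpucontext path: restart instead of -ENOMEM.
        return -ERESTART
    raise ValueError("unknown caller: %r" % (caller,))
```

  Keeping the mapping in one place makes it easy to see that -EINTR never
  escapes to either caller.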
[4.3.0-55.el6.47.2]
- mm: Make scrubbing a low-priority task
  An idle processor will attempt to scrub pages left over by a previously
  exited guest. The processor takes global heap_lock in scrub_free_pages(),
  manipulates pages on the heap lists and releases the lock before performing
  the actual scrubbing in __scrub_free_pages().
  It has been observed that on some systems, even though scrubbing itself
  is done with the lock not held, other unrelated heap users are unable
  to take the (now free) lock. We theorize that massive scrubbing locks out
  the bus (or some other HW resources), preventing lock requests from reaching
  the scrubbing node.
  This patch tries to alleviate this problem by having the scrubber monitor
  whether there are other waiters for the heap lock and, if such waiters
  exist, stop scrubbing.
  To achieve this, we make two changes to existing code:
  1. Parallelize the heap lock by breaking it into per-node locks
  2. Create an atomic per-node counter array. Before a CPU on a particular
  node attempts to acquire the (now per-node) lock it increments the counter.
  The scrubbing processor periodically checks this counter and, if it is
  non-zero, stops scrubbing.
  A few notes:
  1. Until now, total_avail_pages and midsize_alloc_zone_pages updates have been
  performed under the global heap_lock, which was also used to control access to the heap.
  Since heap accesses are now guarded by per-node locks, we introduce heap_lock_global.
  Note that this is really only to protect readers of these variables from reading
  inconsistent values (such as if another CPU is in the middle of updating them).
  The values themselves are somewhat 'unsynchronized' from actual heap state. We
  try to be conservative and decrement them before pages are taken from the heap
  and increment them after they are placed there.
  2. Similarly, page_broken/offlined_list are no longer under heap_lock.
  pglist_lock is added to synchronize access to those lists.
  3. d->last_alloc_node used to be updated under heap_lock. It was read, however,
  without holding this lock so it seems that lockless updates will not make the
  situation any worse (and since these updates are simple writes, as opposed to
  some sort of RMW, we shouldn't need to convert it to an atomic).
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Konrad Rzeszutek Wilk 
  Acked-by: Chuck Anderson 
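  The waiter-counter idea from the entry above can be sketched as a toy
  model. This is Python, not Xen code: NodeHeap, scrub and the batch size
  are hypothetical names, and decrementing the counter once the lock has
  been taken is an assumption, since the entry only says when it is
  incremented.

```python
import threading


class NodeHeap:
    """Toy model of one NUMA node: a heap lock plus a waiter counter."""

    def __init__(self, dirty_pages):
        self.lock = threading.Lock()            # the per-node heap lock
        self.waiters = 0                        # models the atomic counter
        self._guard = threading.Lock()          # Python has no atomic int
        self.dirty_pages = dirty_pages

    def acquire(self):
        # Announce interest in the lock before trying to take it.
        with self._guard:
            self.waiters += 1
        self.lock.acquire()
        with self._guard:
            self.waiters -= 1

    def release(self):
        self.lock.release()


def scrub(node, batch=4):
    """Scrub in batches and stop as soon as somebody waits for the lock."""
    scrubbed = 0
    while node.dirty_pages > 0:
        if node.waiters > 0:
            break                               # yield to the heap user
        n = min(batch, node.dirty_pages)
        node.dirty_pages -= n
        scrubbed += n
    return scrubbed
```

  The scrubber only ever reads the counter, so checking it between batches
  costs little compared with the scrubbing itself.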
[4.3.0-55.el6.47.1]
- IOMMU: make page table deallocation preemptible
  Backport of cedfdd43a97.
  We are spending lots of time flushing the CPU cache, one PTE at a time, to
  make sure that the IOMMU (which may not be able to watch coherence traffic
  on the bus) doesn't load a stale PTE from memory.
  For guests with lots of memory (say, >512GB) this may take as much as
  half a minute or more and as a result (because this is a non-preemptible
  operation) things start to break down.
  Below is the original commit message:
  This too can take an arbitrary amount of time.
  In fact, the bulk of the work is being moved to a tasklet, as handling
  the necessary preemption logic in line seems close to impossible given
  that the teardown may also be invoked on error paths.
  Signed-off-by: Jan Beulich 
  Reviewed-by: Andrew Cooper 
  Acked-by: Xiantao Zhang 
  Signed-off-by: Boris Ostrovsky 
  Acked-by: Chuck Anderson 
[4.3.0-55.el6.47]
- Use AUTO_PHP_SLOT as virtual devfn for rebooted pvhvm guest
  Xend tries to get the vdevfn from the dictionary and uses it as the vdevfn for reboot.
  If on first boot the emulated NIC is unplugged before the passed-through device is hot-plugged,
  and on reboot the order is reversed, there will be a vdevfn conflict.
  qemu.log shows 'hot add pci devfn -2 exceed.'
  This patch can't be upstreamed as upstream has dropped 'xend' completely.
  Signed-off-by: Zhenzhong Duan 
  Signed-off-by: Chuang Cao 
  Signed-off-by: Wengang Wang 
  Acked-by: Konrad Rzeszutek Wilk
[4.3.0-55.el6.46]
- xend: disable vbd discard feature for file type backend
  Signed-off-by: Zhigang Wang 
  Reviewed-by: Konrad Rzeszutek Wilk 
[4.3.0-55.el6.39]
- xend: fix python fork and log consume 100% CPU issue
  It is caused by an internal Python bug: http://bugs.python.org/issue6721 .
  When xend forks a subprocess and then calls a logging function, a deadlock occurs.
  Because Python has no fix yet, remove the logging.debug() call in
  XendBootloader.py to work around it.
  Signed-off-by: Joe Jin 
  Reviewed-by: Zhigang Wang 
[4.3.0-55.el6.38]
- Xen: Fix migration issue from ovm3.2.8 to ovm3.3.x
  This patch is a newer fix for the pvhvm migration failure from
  Xen4.1(ovm3.2.x) to Xen4.3(ovm3.3.x); this issue exists in
  upstream xen too. The original fix causes an issue for released ovm
  versions if the user wants to do live migration with no downtime, since
  that fix requires rebooting the migration source server too.
  This patch keeps the xenstore event channel allocation mechanism of
  Xen4.3 the same as the one in Xen4.1, so migration works from
  Xen4.1 to later Xen with no need to reboot the migration source server.
  The patch that causes this migration issue is,
  http://lists.xen.org/archives/html/xen-devel/2011-11/msg01046.html
  Signed-off-by: Annie Li 
  Acked-by: Adnan Misherfi 
[4.3.0-55.el6.37]
- switch internal hypercall restart indication from -EAGAIN to -ERESTART
 
  -EAGAIN being a return value we want to return to the actual caller in
  a couple of cases makes this unsuitable for restart indication, and x86
  already developed two cases where -EAGAIN could not be returned as
  intended due to this (which is being fixed here at once).
 
  Signed-off-by: Jan Beulich 
  Acked-by: Ian Campbell 
  Reviewed-by: Tim Deegan 
  (cherry-pick from f5118cae0a7f7748c6f08f557e2cfbbae686434a)
  Signed-off-by: Konrad Rzeszutek Wilk 
  Conflicts:
  A LOT
  [There are a lot of changes for this change. We only care about the
  one in the domain destruction. We need the value -EAGAIN to be passed
  to the toolstack so that it will retry the destruction. Any other
  value (-ERESTART) and it will stop - so, as some of the other
  backports do, we only convert -ERESTART to -EAGAIN].
  Acked-by: Chuck Anderson 
  Reviewed-by: John Haxby 
[4.3.0-55.el6.36]
- rc/xendomains: 'stop' - also take care of stuck guests.
  When we are done shutting down the guests (xm --shutdown --all) they
  are at that point not running at all. They might still have
  QEMU or backend drivers set up due to the asynchronous nature
  of the 'shutdown' process. As such doing a 'destroy' on all
  the guests will assure us that the backend drivers and QEMU
  are indeed stopped.
  The mechanism by which 'shutdown' works is quite complex. There
  are three actors at play:
  a) xm client (Which connects to the XML RPC),
  b) Xend Xenstore watch thread,
  c) XML RPC server thread
  The way shutdown starts is:
  xm client                |  XML RPC          | watch thread
  shutdown.py
  - server....shutdown  ---|--> XenDomainInfo:shutdown
  Sets 'control/shutdown'
  calls xc.domain_shutdown
  returns
  - loops calling:
  domains_with_state ----|-->XendDomain:list_names
  gets active   |
  and inactive    | watchMain
  list             _on_domains_changed
  - _refresh
  -> _refreshTxn
  -> update [sets to
  DOM_STATE_SHUTDOWN]
  ->refreshShutdown
  [spawns a new thread calling _maybeRestart]
  [_maybeRestart thread]:
  destroy
  [sets it to DOM_STATE_HALTED]
  -cleanupDomain
  - _releaseDevices
  - ..
  Four threads total.
  There is a race between 'watchMain' being executed and 'domains_with_state'
  calling 'list_names'. Guests that are in DOM_STATE_UNKNOWN or DOM_STATE_PAUSED
  might not be updated to DOM_STATE_SHUTDOWN as list_names can be called
  _before_ watchMain triggers. There is a lock acquisition to call 'refresh'
  in list_names - but if it fails - it will just use the stale list.
  As such the process works great for guests that are in STATE_SHUTDOWN,
  STATE_HALT, or STATE_RUNNING - which 'domains_with_state' will present
  to the shutdown process.
  For the other states (the more troublesome ones) we might have them
  still lying around.
  As such this patch calls 'xm destroy' on all those remaining guests
  to do cleanup.
  Signed-off-by: Konrad Rzeszutek Wilk 
  Acked-by: Chuck Anderson 
  Reviewed-by: John Haxby 
[4.3.0-55.el6.35]
- xend: Fix race between shutdown and cleanup.
  When we invoke 'xm shutdown --wait --all' we will exit the moment
  the guest has stopped executing. That is when xcinfo returns
  shutdown=1. However that does not mean that all the infrastructure
  around the guest has been torn down - QEMU can be still running,
  Netback and Blkback as well. In the past the time between
  the shutdown and qemu being disposed of was quick - however
  the race was still present there.
  With our usage of PCIe passthrough we MUST unbind those devices
  from a guest before we can continue on with the reboot of
  the system. That is due to the complex interaction the SR-IOV
  devices have with VF and PFs - as you cannot unload the PF driver
  before the VF drivers have been unbound from the guest.
  If you try to reboot the machine at this point the PF driver
  will not unload.
  The VF drivers are bound to Xen pciback - and they are unbound
  when QEMU is stopped and XenStore keys are torn down - which
  is done _after_ the 'shutdown' xcinfo is set (in the cleanup
  stage). Worse, the Xen blkback is still active - which means
  we cannot unmount the storage until said cleanup has finished.
  But as mentioned - 'xm shutdown --wait --all' would happily
  exit before the cleanup finished and the shutdown (or reboot)
  of the initial domain would continue on. It would eventually
  get wedged when trying to unmount the storage which still
  had a refcount from Xen block driver - which was not cleaned up
  as Xend was killed earlier.
  This patch solves this by delaying 'xm shutdown --wait --all'
  to wait until the guest has transitioned through the RUNNING ->
  SHUTDOWN -> HALTED stages. SHUTDOWN means it has ceased
  to execute. HALTED means the cleanup is being performed.
  We will cycle through all of the guests in those states until
  they have moved out of them (removed completely from
  the system).
  Signed-off-by: Konrad Rzeszutek Wilk 
  Acked-by: Chuck Anderson 
  Reviewed-by: John Haxby 
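  The waiting behaviour described in the entry above amounts to a polling
  loop. The following is a minimal sketch, not xend's code: wait_for_cleanup,
  list_states and the lowercase state names are hypothetical, and a guest
  that has been fully removed is assumed to simply disappear from the
  returned mapping.

```python
import time

# States in which a guest still has infrastructure (QEMU, backends) around.
PENDING_STATES = {"running", "shutdown", "halted"}


def wait_for_cleanup(list_states, poll_interval=0.0, max_polls=100):
    """Poll until no guest is still running, shut down or halted.

    `list_states` is a hypothetical callable returning {name: state}.
    Returns True when every guest is gone, False if polling gave up.
    """
    for _ in range(max_polls):
        pending = [name for name, state in list_states().items()
                   if state in PENDING_STATES]
        if not pending:
            return True
        time.sleep(poll_interval)
    return False
```

  Bounding the number of polls keeps a stuck guest from blocking the host
  shutdown forever; what to do on a False return is left to the caller.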
[4.3.0-55.el6.22]
- hvmloader: don't use AML operations on 64-bit fields
  WinXP and Win2K3, while having no problem with the QWordMemory resource
  (there was another one there before), don't like operations on 64-bit
  fields. Split the fields d0688669 ('hvmloader: also cover PCI MMIO
  ranges above 4G with UC MTRR ranges') added to 32-bit ones, handling
  carry over explicitly.
  Sadly the constructs needed to create the sub-fields - nominally
  CreateDWordField(PRT0, _SB.PCI0._CRS._Y02._MIN, MINL)
  CreateDWordField(PRT0, Add(_SB.PCI0._CRS._Y02._MIN, 4), MINH)
  - can't be used: The former triggers a warning from newer iasl, i.e. would
  need to be replaced by the latter just with the addend changed to 0,
  and the latter doesn't translate properly with recent iasl. Hence,
  short of having an ASL/iasl expert at hand, we need to work around the
  shortcomings of various iasl versions. See the code comment.
  Signed-off-by: Jan Beulich 
  Acked-by: Ian Campbell 
  (cherry picked from commit 7f8d8abcf6dfb85fae591a547b24f9b27d92272c)
  Signed-off-by: Konrad Rzeszutek Wilk 
  Committed-by: Zhenzhong Duan 
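  Splitting a 64-bit field into two 32-bit halves and handling the carry
  explicitly, as the entry above describes for the AML code, follows the
  usual multi-word arithmetic pattern. The sketch below shows that pattern
  in Python rather than ASL; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    return value & MASK32, (value >> 32) & MASK32


def join64(low, high):
    """Rebuild a 64-bit value from its 32-bit halves."""
    return (high << 32) | low


def add64_via_32(a, b):
    """Add two 64-bit values using only 32-bit pieces plus a carry."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32                 # 1 if the low halves overflowed
    low = lo_sum & MASK32
    high = (a_hi + b_hi + carry) & MASK32
    return low, high
```

  The carry is the only coupling between the halves, which is why it has to
  be propagated by hand once the 64-bit operation is no longer available.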
[4.3.0-55.el6.21]
- hvmloader: fix build with certain iasl versions
  While most of them support what we have now, Wheezy's dislikes the
  empty range. Put a fake one in place - it's getting overwritten upon
  evaluation of _CRS anyway.
  The range could be grown (downwards) if necessary; the way it is now
  it is
  - the highest possible one below the 36-bit boundary (with 36 bits
  being the lowest common denominator for all supported systems),
  - the smallest possible one that said iasl accepts.
  Reported-by: Sander Eikelenboom 
  Signed-off-by: Jan Beulich 
  Acked-by: Ian Campbell 
  (cherry picked from commit 119d8a42d3bfe6ebc1785720e1a7260e5c698632)
  Signed-off-by: Konrad Rzeszutek Wilk 
  Committed-by: Zhenzhong Duan 
[4.3.0-55.el6.20]
- hvmloader: also cover PCI MMIO ranges above 4G with UC MTRR ranges
  When adding support for BAR assignments to addresses above 4G, the MTRR
  side of things was left out.
  Additionally the MMIO ranges in the DSDT's _SB.PCI0._CRS were having
  memory types not matching the ones put into MTRRs: The legacy VGA range
  is supposed to be WC, and the other ones should be UC.
  Signed-off-by: Jan Beulich 
  Acked-by: Ian Campbell 
  (cherry picked from commit d06886694328a31369addc1f614cf326728d65a6)
  Signed-off-by: Konrad Rzeszutek Wilk 
  Committed-by: Zhenzhong Duan 
[4.3.0-55.el6.19]
- Add 64-bit support to QEMU.
  Currently it is assumed that PCI device BARs are accessed below 4G. A
  device whose BAR size is larger than 4G must access memory addresses above 4G.
  This patch enables 64-bit big BAR support in qemu-xen.
  Signed-off-by: Xiantao Zhang 
  Signed-off-by: Xudong Hao 
  Tested-by: Michel Riviere 
  Signed-off-by: Zhenzhong Duan
  Signed-off-by: Konrad Rzeszutek Wilk 
  Committed-by: Zhenzhong Duan 
[4.3.0-55.el6.18]
- tasklet: Introduce per-cpu tasklet for softirq (v5)
  This implements a lockless per-cpu tasklet mechanism.
  The existing tasklet mechanism has a single global
  spinlock that is taken every time the global list
  is touched. And we use this lock quite a lot - when
  we call do_tasklet_work which is called via a softirq
  and from the idle loop. We take the lock on any
  operation on the tasklet_list.
  The problem we are facing is that there are quite a lot of
  tasklets scheduled. The most common one that is invoked is
  the one injecting the VIRQ_TIMER in the guest. Guests
  are not insane and don't set the one-shot or periodic
  clocks to be in sub 1ms intervals (causing said tasklet
  to be scheduled for such small intervals).
  The problem appears when PCI passthrough devices are used
  over many sockets and we have a mix of heavy-interrupt
  guests and idle guests. The idle guests end up seeing
  1/10 of their RUNNING timeslice eaten by the hypervisor
  (and 40% steal time).
  The mechanism by which we inject PCI interrupts is by
  hvm_do_IRQ_dpci which schedules the hvm_dirq_assist
  tasklet every time an interrupt is received.
  The callchain is:
  _asm_vmexit_handler
  -> vmx_vmexit_handler
  ->vmx_do_extint
  -> do_IRQ
  -> __do_IRQ_guest
  -> hvm_do_IRQ_dpci
  tasklet_schedule(&dpci->dirq_tasklet);
  [takes lock to put the tasklet on]
  [later on the schedule_tail is invoked which is 'vmx_do_resume']
  vmx_do_resume
  -> vmx_asm_do_vmentry
  -> call vmx_intr_assist
  -> vmx_process_softirqs
  -> do_softirq
  [executes the tasklet function, takes the
  lock again]
  While on other CPUs they might be sitting in an idle loop
  and invoked to deliver a VIRQ_TIMER, which also ends
  up taking the lock twice: first to schedule the
  v->arch.hvm_vcpu.assert_evtchn_irq_tasklet (accounted to
  the guests' BLOCKED_state); then to execute it - which is
  accounted for in the guest's RUNTIME_state.
  The end result is that on an 8-socket machine with
  PCI passthrough, where four sockets are busy with interrupts,
  and the other sockets have idle guests - we end up with
  the idle guests having around 40% steal time and 1/10
  of their timeslice (3ms out of 30 ms) being tied up
  taking the lock. The latency of the PCI interrupts delivered
  to the guest is also hindered.
  With this patch the problem disappears completely.
  That is, by removing the lock for the PCI passthrough use-case
  (the 'hvm_dirq_assist' case).
  As such this patch introduces the code to setup
  softirq per-cpu tasklets and only modifies the PCI
  passthrough cases instead of doing it wholesale. This
  is done because:
  - We want to easily bisect it if things break.
  - We modify the code one section at a time to
  make it easier to review this core code.
  Now on to the code itself. The Linux code (softirq.c)
  has a per-cpu implementation of tasklets on which
  this was based. However there are differences:
  - This patch executes one tasklet at a time - similar
  to how the existing implementation does it.
  - We use a double-linked list instead of a single linked
  list. We could use a single-linked list but folks are
  more familiar with 'list_*' type macros.
  - This patch does not have the cross-CPU feeders
  implemented. That code is in the patch
  titled: tasklet: Add cross CPU feeding of per-cpu
  tasklets. This is done to support:
  'tasklet_schedule_on_cpu'
  - We add a temporary 'TASKLET_SOFTIRQ_PERCPU' which
  can co-exist with the TASKLET_SOFTIRQ. It will be
  replaced in 'tasklet: Remove the old-softirq
  implementation.'
  Signed-off-by: Konrad Rzeszutek Wilk 
  Acked-by: Adnan Misherfi 
  Backported-by: Joe Jin 
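  The per-cpu arrangement described in the entry above can be pictured with
  a toy model. This is a sketch in Python, not the Xen implementation: the
  class PerCpuTasklets is hypothetical, and it assumes each queue is only
  ever touched by its owning CPU, which is what makes a lock unnecessary.

```python
from collections import deque


class PerCpuTasklets:
    """One private tasklet queue per CPU instead of one global locked list."""

    def __init__(self, nr_cpus):
        self.queues = [deque() for _ in range(nr_cpus)]

    def schedule(self, cpu, func):
        # Only `cpu` itself queues work here (no cross-CPU feeding).
        self.queues[cpu].append(func)

    def do_softirq(self, cpu):
        """Run at most one pending tasklet on `cpu`; report whether one ran."""
        queue = self.queues[cpu]
        if not queue:
            return False
        func = queue.popleft()
        func()
        return True
```

  Running one tasklet per invocation mirrors the one-at-a-time behaviour the
  entry mentions; scheduling onto another CPU would need extra machinery.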
[4.3.0-55.el6.17]
- libxl/sysctl/ionuma: Make 'xl info -n' print device topology
  'xl info -n' will provide both CPU and IO topology information. Note
  that xend (i.e. 'xm' variant of this command) will continue to only
  print CPU topology.
  To minimize code changes, libxl_get_topologyinfo (libxl's old interface
  for topology) is preserved so its users (other than output_topologyinfo())
  are not modified.
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Konrad Rzeszutek Wilk 
  Backported-by: Joe Jin 
[4.3.0-55.el6.16]
- pci: Manage NUMA information for PCI devices
  Keep track of device's PXM data (in the form of node ID)
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Konrad Rzeszutek Wilk 
  Backported-by: Joe Jin 
[4.3.0-55.el6.15]
- libxl: ocaml: support for Arrays in bindings generator.
  No change in generated code because no arrays are currently generated.
  Signed-off-by: Ian Campbell 
  Signed-off-by: Rob Hoes 
  Acked-by: David Scott 
  Backported-by: Joe Jin 
[4.3.0-55.el6.14]
- Reduce domain destroy time by delay page scrubbing
  Because of page scrubbing, it's very slow to destroy a domain with large
  memory.
  This patch introduces a 'PGC_need_scrub' flag; a page with this flag set
  needs to be scrubbed before use.
  During domain destroy, pages are marked as 'PGC_need_scrub' and added to the free
  heap list, so that xl can return quickly. The real scrub is delayed to the
  allocation path if a page with 'PGC_need_scrub' is allocated.
  Besides that, all idle vcpus are triggered to do the scrub job in parallel before
  they enter sleep.
  In order to get rid of heavy lock contention, a percpu list is used:
  - Delist a batch of pages to a percpu list from 'scrub' free page list.
  - Scrub pages on this percpu list.
  - Return those clean pages to normal 'heap' free page list, merge with other
  chunks if needed.
  On a ~500GB guest, shutdown took slightly over one minute, compared with over 6
  minutes without this patch.
  Signed-off-by: Bob Liu 
  Acked-by: Adnan Misherfi 
  Signed-off-by: Konrad Rzeszutek Wilk 
  Backported-by: Joe Jin 
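  The deferred-scrub scheme in the entry above can be illustrated with a
  small model. It is a sketch in Python, not the hypervisor code: Page, Heap
  and their methods are hypothetical, a boolean stands in for the
  'PGC_need_scrub' flag, and merging of free chunks is left out.

```python
class Page:
    def __init__(self, data):
        self.data = bytearray(data)
        self.need_scrub = False          # models the PGC_need_scrub flag


def scrub_page(page):
    """Zero the page contents and clear the flag."""
    for i in range(len(page.data)):
        page.data[i] = 0
    page.need_scrub = False


class Heap:
    def __init__(self):
        self.free = []

    def free_domain_pages(self, pages):
        """Domain destroy: mark pages and free them without scrubbing."""
        for page in pages:
            page.need_scrub = True
            self.free.append(page)

    def alloc(self):
        """Allocation path: scrub lazily if the page is still marked."""
        page = self.free.pop()
        if page.need_scrub:
            scrub_page(page)
        return page

    def idle_scrub(self, batch):
        """Idle path: move a batch to a private list, scrub, hand it back."""
        private = [p for p in self.free if p.need_scrub][:batch]
        for page in private:
            self.free.remove(page)
        for page in private:
            scrub_page(page)
        self.free.extend(private)
        return len(private)
```

  Either path clears the flag, so a page is never handed out with its old
  contents regardless of whether an idle CPU reached it first.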
[4.3.0-55.el6.13]
- Revert 'pci: Manage NUMA information for PCI devices'
  Backport-by: Joe Jin 
[4.3.0-55.el6.12]
- Revert 'libxl/sysctl/ionuma: Make 'xl info -n' print device topology'
  Signed-off-by: Joe Jin 
[4.3.0-55.el6.11]
- libxl/sysctl/ionuma: Make 'xl info -n' print device topology
  'xl info -n' will provide both CPU and IO topology information. Note
  that xend (i.e. 'xm' variant of this command) will continue to only
  print CPU topology.
  To minimize code changes, libxl_get_topologyinfo (libxl's old interface
  for topology) is preserved so its users (other than output_topologyinfo())
  are not modified.
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Konrad Rzeszutek Wilk 
  Backported-by: Joe Jin 
[4.3.0-55.el6.10]
- pci: Manage NUMA information for PCI devices
  Keep track of device's PXM data (in the form of node ID)
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Konrad Rzeszutek Wilk 
  Backport-by: Joe Jin 
[4.3.0-55.el6.9]
- tools/python: expose xc_getcpuinfo()
  This API can be used to get per physical CPU utilization.
  Testing:
  >>> import xen.lowlevel.xc
  >>> xc = xen.lowlevel.xc.xc()
  >>> xc.getcpuinfo()
  Traceback (most recent call last):
  File '
  TypeError: Required argument 'max_cpus' (pos 1) not found
  >>> xc.getcpuinfo(4)
  [{'idletime': 109322086128854}, {'idletime': 109336447648802},
  {'idletime': 109069270544960}, {'idletime': 109065612611363}]
  >>> xc.getcpuinfo(100)
  [{'idletime': 109639015806078}, {'idletime': 109654551195681},
  {'idletime': 109382107891193}, {'idletime': 109382057541119}]
  >>> xc.getcpuinfo(1)
  [{'idletime': 109682068418798}]
  >>> xc.getcpuinfo(2)
  [{'idletime': 109711311201330}, {'idletime': 109728458214729}]
  >>> xc.getcpuinfo(max_cpus=4)
  [{'idletime': 109747116214638}, {'idletime': 109764982453261},
  {'idletime': 109491373228931}, {'idletime': 109489858724432}]
  Signed-off-by: Zhigang Wang 
  Acked-by: Ian Campbell 
  Upstream commit: a9958947e49644c917c2349a567b2005b08e7c1f [bug 19707017]
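  Per-CPU utilization can be derived from two samples of the list that
  getcpuinfo() returns in the session above. The helper below is a sketch
  and not part of the bindings: cpu_utilization is a hypothetical name, and
  it assumes that 'idletime' is a cumulative counter in nanoseconds and that
  the caller measures the elapsed time in the same unit.

```python
def cpu_utilization(before, after, elapsed_ns):
    """Return the busy fraction (0.0 to 1.0) of each CPU between two samples.

    `before` and `after` are lists of {'idletime': ...} dicts as shown in
    the session above; `elapsed_ns` is the time between the two samples.
    """
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    result = []
    for old, new in zip(before, after):
        idle = new["idletime"] - old["idletime"]
        busy = 1.0 - idle / elapsed_ns
        # Clamp, since the two samples are not taken atomically.
        result.append(min(1.0, max(0.0, busy)))
    return result
```

  In practice the two samples would come from two xc.getcpuinfo(max_cpus)
  calls separated by a sleep, with the interval taken from a monotonic
  clock.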
[4.3.0-55.el6.8]
- xend: disable sslv3 due to CVE-2014-3566
  Signed-off-by: Zhigang Wang 
  Signed-off-by: Kurt Hackel 
  Signed-off-by: Adnan Misherfi 
  Backported-by: Chuang Cao 
[4.3.0-55.el6.7]
- xend: fix domain destroy after reboot
  Signed-off-by: Zhigang Wang 
  Signed-off-by: Joe Jin 
  Signed-off-by: Iain MacDonnell 
[4.3.0-55.el6.6]
- Keep the maxmem and memory same in vm.cfg
  Signed-off-by: Annie Li 
  Signed-off-by: Adnan Misherfi 
  Signed-off-by: Joe Jin 
[4.3.0-55.el6.5]
- xen: Only allocating the xenstore event channel earlier
  This patch allocates the xenstore event channel earlier to fix the migration
  issue from ovm3.2.8 to 3.3.1, and also reverts the change for the console
  event channel to avoid it being set to none after allocation.
  Signed-off-by: Annie Li 
  Acked-by: Adnan Misherfi 
  Backported-by: Joe Jin 
[4.3.0-55.el6.4]
- Increase xen max_phys_cpus to support hardware with 384 CPUs
  Signed-off-by: Adnan Misherfi 
  Backported-by:  Adnan Misherfi 
[4.3.0-55.el6.3]
- Fix migration bug from OVM3.2.8(Xen4.1.3) to OVM3.3.1(Xen4.3.x)
  The pvhvm migration from ovm3.2.8 to ovm3.3.1 fails because the xenstore event channel number changes;
  this patch allocates the xenstore event channel as early as possible to avoid this issue.
  Signed-off-by: Annie Li 
  Backported-by: Joe Jin 
[4.3.0-55.el6.2]
- Fix the panic on HP DL580 Gen8.
  Signed-off-by: Konrad Wilk 
  Signed-off-by: Adnan Misherfi 
  Backported-by: Chuang Cao 
[4.3.0-55.el6.1]
- Before connecting the emulated network interface (vif.x.y-emu) to a bridge, change the emu MTU to
  equal the MTU of the bridge to prevent the bridge from downgrading its own MTU to equal the emu MTU.
  Signed-off-by: Adnan Misherfi 
  Backported-by: Chuang Cao 
[4.3.0-55]
- x86/HVM: use fixed TSC value when saving or restoring domain
 
  When a domain is saved each VCPU's TSC value needs to be preserved. To get it we
  use hvm_get_guest_tsc(). This routine (either itself or via get_s_time() which
  it may call) calculates VCPU's TSC based on current host's TSC value (by doing a
  rdtscll()). Since this is performed for each VCPU separately we end up with
  un-synchronized TSCs.
 
  Similarly, during a restore each VCPU is assigned its TSC based on host's current
  tick, causing virtual TSCs to diverge further.
 
  With this, we can easily get into a situation where a guest may see time going
  backwards.
 
  Instead of reading new TSC value for each VCPU when saving/restoring it we should
  use the same value across all VCPUs.
 
  Reported-by: Philippe Coquard 
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Jan Beulich 
  commit: 88e64cb785c1de4f686c1aa1993a0003b7db9e1a [bug 18755631]
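  The difference between sampling the host TSC once and sampling it per VCPU
  can be shown with a small model. This is a sketch in Python, not the Xen
  code: both function names are hypothetical, and the guest TSC is
  simplified to "host TSC plus a per-VCPU offset".

```python
import itertools


def save_tsc_per_vcpu(read_host_tsc, offsets):
    """Old behaviour: read the host TSC separately for every VCPU."""
    return [read_host_tsc() + offset for offset in offsets]


def save_tsc_fixed(read_host_tsc, offsets):
    """Patched behaviour: read the host TSC once and reuse that value."""
    now = read_host_tsc()
    return [now + offset for offset in offsets]


# A fake host TSC that advances by 50 ticks on every read.
ticks = itertools.count(1000, 50)
per_vcpu = save_tsc_per_vcpu(lambda: next(ticks), [0, 0, 0])
fixed = save_tsc_fixed(lambda: next(ticks), [0, 0, 0])
print(per_vcpu)   # three different values
print(fixed)      # one value repeated three times
```

  With identical offsets, only the second variant gives every VCPU the same
  saved value, which is the property the patch is after.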
[4.3.0-54]
- iommu: set correct IOMMU entries when iommu_hap_pt_share == 0
  If the memory map is not shared between HAP and IOMMU we fail to set
  correct IOMMU mappings for memory types other than p2m_ram_rw.
  This patch adds IOMMU support for the following memory types:
  p2m_grant_map_rw, p2m_map_foreign, p2m_ram_ro, p2m_grant_map_ro and
  p2m_ram_logdirty.
  Signed-off-by: Roger Pau Monné
  Cc: Tim Deegan 
  Cc: Jan Beulich 
  Tested-by: David Zhuang 
  ---
  Changes since v1:
  - Move the p2m type switch to IOMMU flags to an inline function that
  is shared between p2m-ept and p2m-pt.
  - Make p2m_set_entry also use p2m_get_iommu_flags.
  ---
  When backporting this patch it would not apply cleanly due to two commits
  not existing in the Xen 4.3 repo:
  commit 243cebb3dfa1f94ec7c2b040e8fd15ae4d81cc5a
  Author: Mukesh Rathor 
  Date:   Thu Apr 17 10:05:07 2014 +0200
  pvh dom0: introduce p2m_map_foreign
  [adds the p2m_map_foreign type]
  commit 3d8d2bd048773ababfa65cc8781b9ab3f5cf0eb0
  Author: Jan Beulich 
  Date:   Fri Mar 28 13:37:10 2014 +0100
  x86/EPT: simplification and cleanup
  [simplifies the loop in ept_set_entry]
  As such the original patch from
  http://lists.xen.org/archives/html/xen-devel/2014-04/msg02928.html
  has been slightly changed.
  Signed-off-by: Konrad Rzeszutek Wilk 
[4.3.0-53]
- x86/svm: enable TSC scaling
 
  TSC ratio enabling logic is inverted: we want to use it when we
  are running in native tsc mode, i.e. when d->arch.vtsc is zero.
 
  Also, since now svm_set_tsc_offset()'s calculations depend
  on vtsc's value, we need to call hvm_funcs.set_tsc_offset() after
  vtsc changes in tsc_set_info().
 
  In addition, with TSC ratio enabled, svm_set_tsc_offset() will
  need to do rdtsc. With that we may end up having TSCs on guest's
  processors out of sync. d->arch.hvm_domain.sync_tsc which is set
  by the boot processor can now be used by APs as reference TSC
  value instead of host's current TSC.
 
  Signed-off-by: Boris Ostrovsky 
  Reviewed-by: Jan Beulich 
  commit: b95fd03b5f0b66384bd7c190d5861ae68eb98c85 [bug 18755631]
[4.3.0-52]
- x86: use native RDTSC(P) execution when guest and host frequencies are the same
  We should be able to continue using native RDTSC(P) execution on
  HVM/PVH guests after migration if host and guest frequencies are
  equal (this includes the case when the frequencies are made equal
  by TSC scaling feature).
 
  This also allows us to revert main part of commit 4aab59a3 (svm: Do not
  intercept RDTSC(P) when TSC scaling is supported by hardware) which
  was wrong: while RDTSC intercepts were disabled domain's vtsc could
  still be set, leading to inconsistent view of guest's TSC.
 
  Signed-off-by: Boris Ostrovsky 
  Acked-by: Jan Beulich 
  commit: 82713ec8d2b65d17f13e46a131e38bfe5baf8bd6 [bug 18755631]
[4.3.0-50]
- Signed-off by: Adnan G Misherfi 
  Signed-off by: Zhigang Wang 
[4.3.0-49]
- Check in the following patch for Konrad:
  From Message-ID: <1332267691-13179-1-git-send-email-david.vrabel@citrix.com>
  If a maximum reservation for dom0 is not explicitly given (i.e., no
  dom0_mem=max:MMM command line option), then set the maximum
  reservation to the initial number of pages.  This is what most people
  seem to expect when they specify dom0_mem=512M (i.e., exactly 512 MB
  and no more).
  This change means that with Linux 3.0.5 and later kernels,
  dom0_mem=512M has the same result as older, 'classic Xen' kernels. The
  older kernels used the initial number of pages to set the maximum
  number of pages and did not query the hypervisor for the maximum
  reservation.
  It is still possible to have a larger reservation by explicitly
  specifying dom0_mem=max:MMM.
  Signed-off-by: David Vrabel 
  Signed-off-by: Konrad Rzeszutek Wilk 
  NOTE: This behaviour should also be implemented in the Linux kernel. [bug 13860516] [bug 18552768]
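  The defaulting rule described in the entry above can be sketched as a
  small parser. This is illustrative Python, not the hypervisor's
  command-line code: both helpers are hypothetical, only the M and G
  suffixes are handled, and the only accepted forms are an initial amount
  with an optional 'max:' part.

```python
def parse_size_mb(text):
    """Parse '512M' or '2G' into MiB (only these two suffixes are handled)."""
    units = {"M": 1, "G": 1024}
    suffix = text[-1:].upper()
    if suffix not in units:
        raise ValueError("unsupported size: %r" % (text,))
    return int(text[:-1]) * units[suffix]


def dom0_reservation(option):
    """Return (initial_mb, max_mb) for a simplified dom0_mem= value."""
    initial = maximum = None
    for part in option.split(","):
        if part.startswith("max:"):
            maximum = parse_size_mb(part[len("max:"):])
        else:
            initial = parse_size_mb(part)
    if initial is None:
        raise ValueError("this sketch requires an initial amount")
    if maximum is None:
        # No explicit maximum: cap the reservation at the initial size.
        maximum = initial
    return initial, maximum
```

  For example, dom0_reservation("512M") yields the same value for both
  fields, while "512M,max:2G" keeps the larger explicit maximum.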
[4.3.0-48]
- Check in the following patch for Konrad:
  From: Konrad Rzeszutek Wilk 
  When we migrate an HVM guest, by default our shared_info can
  only hold up to 32 CPUs. As such the hypercall
  VCPUOP_register_vcpu_info was introduced which allowed us to
  setup per-page areas for VCPUs. This means we can boot PVHVM
  guest with more than 32 VCPUs. During migration the per-cpu
  structure is allocated fresh by the hypervisor (vcpu_info_mfn
  is set to INVALID_MFN) so that the newly migrated guest
  can make the VCPUOP_register_vcpu_info hypercall.
  Unfortunately we end up triggering this condition:
  /* Run this command on yourself or on other offline VCPUS. */
  if ( (v != current) && !test_bit(_VPF_down, &v->pause_flags) )
  which means we are unable to setup the per-cpu VCPU structures
  for running vCPUs. The Linux PV code paths make this work by
  iterating over every vCPU with:
  1) is target CPU up (VCPUOP_is_up hypercall?)
  2) if yes, then VCPUOP_down to pause it.
  3) VCPUOP_register_vcpu_info
  4) if it was down, then VCPUOP_up to bring it back up
  But since VCPUOP_down, VCPUOP_is_up, and VCPUOP_up are
  not allowed on HVM guests we can't do this. This patch
  enables this.
  Signed-off-by: Konrad Rzeszutek Wilk 
[4.3.0-46]
- The following patch was missed when we upgraded OVM xen to 4.3:
  From 5eda9dfe0a2e11d9c91717f83ddbb2f52e7535e7 Mon Sep 17 00:00:00 2001
  From: Zhenzhong Duan 
  Date: Fri, 4 Apr 2014 15:36:36 -0400
  Subject: [PATCH] qemu-xen-trad: free all the pirqs for msi/msix when driver
  unloads
  Pirqs are not freed when the driver unloads, then new pirqs are allocated when
  the driver reloads. This could exhaust pirqs if done in a loop.
  This patch fixes the bug by freeing pirqs when the ENABLE bit is cleared in
  the msi/msix control reg.
  There are other ways of fixing it, such as reusing pirqs across driver reloads,
  but this way is better.
  Xen-devel: http://marc.info/?l=xen-devel&m=136800120304275&w=2
  Signed-off-by: Zhenzhong Duan 
  Signed-off-by: Konrad Rzeszutek Wilk 
[4.3.0-45]
- check in upstream dd03048 patch to add support for OL7 VM [bug 18487695]
[4.3.0-44]
- Just release running lock after a domain is gone.
  Signed-off-by: Chuang Cao 
  Signed-off-by: Zhigang Wang 
  Acked-by: Konrad Rzeszutek Wilk 
  Acked-by: Adnan Misherfi 
  Acked-by: Julie Trask 
[4.3.0-43]
- Backport xen patch 'reset TSC to 0 after domain resume from S3' [bug 18010443]
[4.3.0-42]
- Release domain running lock correctly
  When the domain dies very early by:
  VmError: HVM guest support is unavailable: is VT/AMD-V supported by your CPU and enabled in your BIOS?
  We don't release the domain running lock correctly.
  Signed-off-by: Zhigang Wang 
  Signed-off-by: Adnan Misherfi 
[4.3.0-41]
- x86/pci: Store VF's memory space displacement in a 64-bit value
  VF's memory space offset can be greater than 4GB and therefore needs
  to be stored in a 64-bit variable.
  commit: 001bdcee7bc19be3e047d227b4d940c04972eb02
  Acked-by: Adnan Misherfi 
  Signed-off-by: Boris Ostrovsky 
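  Why a 32-bit variable is not enough for the offset can be seen with a
  short calculation. This is a sketch in Python, not the Xen code:
  vf_bar_address is a hypothetical helper, and it assumes the usual SR-IOV
  layout in which a VF's region sits at the VF index times the per-VF BAR
  size from the base.

```python
def vf_bar_address(base, vf_index, vf_bar_size, bits=64):
    """Address of a VF's memory region, with the offset held in `bits` bits."""
    offset = (vf_index * vf_bar_size) & ((1 << bits) - 1)
    return base + offset


GIB = 1 << 30
# Sixth VF (index 5) with 1 GiB per VF: the offset is 5 GiB.
print(hex(vf_bar_address(0, 5, GIB, bits=64)))
print(hex(vf_bar_address(0, 5, GIB, bits=32)))
```

  With the offset truncated to 32 bits, the sixth VF aliases onto the second
  VF's region, which is the kind of error the wider variable avoids.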
[4.3.0-34]
- Test if openvswitch kernel module is loaded to determine where to attach the VIF (bridge or openvswitch) [bug 17885201]
[4.3.0-33]
- Signed-off by: Zhigang Wang 
  Signed-off by: Adnan G Misherfi 
[4.3.0-32]
- Add the following upstream commits:
    - 2cebe22e6924439535cbf4a9f82a7d9d30c8f9c7
      (libxenctrl: Fix xc_interface_close() crash if it gets NULL as an argument),
    - dc37e0bfffc673f4bdce1d69ad86098bfb0ab531
      (x86: fix early boot command line parsing),
    - 7113a45451a9f656deeff070e47672043ed83664
      (kexec/x86: do not map crash kernel area).
[4.3.0-31]
- Signed-off by: Adnan G Misherfi 
  Signed-off by: Zhigang Wang 
| Release/Architecture | Filename | sha256 | Superseded By Advisory | Channel Label |
| Oracle VM 3.3 (x86_64) | xen-4.3.0-55.el6.47.33.src.rpm | 595942ef34a1301097bb31a535c956a4224dc8010176f9a319ec7ccb56412dce | OVMBA-2024-0012 | ovm3_x86_64_3.3_patch |
| | xen-4.3.0-55.el6.47.33.x86_64.rpm | 6c8c3cf096f668fecd6dc69c471ebe889de54640cbdcb98abc5310e51e34c825 | OVMBA-2024-0012 | ovm3_x86_64_3.3_patch |
| | xen-tools-4.3.0-55.el6.47.33.x86_64.rpm | 3dd03379910fc7028c0af83068cf1458d81c44928cd3469cdfa94849beda99a9 | OVMBA-2024-0012 | ovm3_x86_64_3.3_patch |